Docker has had support for adding labels to images for a while. From this we get some handy features: everything from being able to filter running containers on the command line to providing hints for a scheduler. Since labels are also exposed via the API, the potential for tools to be built atop good metadata is huge. Take the handy label browsing features of MicroBadger, or the ability to provide minimum-resource information to OpenShift.
In taking advantage of these upsides, we want to avoid having duplicate metadata for every tool. That means we need to come to some level of agreement about label names and formats for some common attributes. In my DockerCon EU talk at the end of last year, I made a call for a shared community namespace to start building that agreement.
Puppet and Metadata
Why is Puppet interested in metadata? Well, lots of Puppet’s features rely on low-level tools providing good data. Package managers like RPM, APK, or APT are incredibly useful and powerful. This isn’t because of the file format (often just a tar file), but because of the information in the manifest or spec file that is made available through a standard API. This shared agreement on the package metadata allows for powerful tools like Puppet to be built on top. So obviously we’d be interested in helping build a shared agreement around container labels.
Enter Label Schema
A number of people within the container community had similar thoughts, and over time a group of users and software vendors came together at events and online. This included folks from Mesosphere and Puppet, as well as Container Solutions, Weave, Microscaling and more.
The initial focus has been on a small (and hopefully obvious) set of labels under the org.label-schema namespace. We’re trying to walk before we run, and are mainly focusing on things that people are already doing (just currently under a variety of namespaces and in inconsistent ways). This is definitely a pave-the-cowpaths effort.
At Container Camp in London, we announced a release candidate and are seeking much broader input to a set of shared labels. We think we have the scope down for v1, but I’m sure we’ll see some of the details change a little with more input from a wider group of users.
What does this mean in practice? Here’s an example of the kind of labels we’ll soon be adding to our images on Docker Hub.
LABEL org.label-schema.vendor="Puppet" \
      org.label-schema.url="https://github.com/puppetlabs/puppetserver" \
      org.label-schema.name="Puppet Server" \
      org.label-schema.version="2.6.0" \
      org.label-schema.vcs-url="github.com:puppetlabs/puppetserver.git" \
      org.label-schema.vcs-ref="ebd57d487d209bf575fcce26335c8f3e0ad09288" \
      org.label-schema.build-date="2016-09-08T23:20:50.52Z" \
      org.label-schema.docker.schema-version="1.0"
The release candidate defines the labels that exist under the org.label-schema namespace, along with the purpose and format of their values.
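To make "purpose and format" a little more concrete, here is a minimal Python sketch of the kind of check a tool could run over an image's labels. The rules it applies are assumptions for illustration: it treats name, vendor and version as expected, and it expects build-date to look like an RFC 3339 UTC timestamp. The release candidate itself is the authority on what each label must contain, and the function name check_labels is hypothetical.

```python
import re

PREFIX = "org.label-schema."

# Assumed for illustration: an RFC 3339 UTC timestamp such as
# 2016-09-08T23:20:50.52Z (optional fractional seconds, trailing "Z").
RFC3339_UTC = re.compile(r"^\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}(\.\d+)?Z$")


def check_labels(labels):
    """Return a list of human-readable problems found in a label mapping."""
    problems = []

    # Only check the timestamp format when the label is actually present.
    build_date = labels.get(PREFIX + "build-date")
    if build_date is not None and not RFC3339_UTC.match(build_date):
        problems.append("build-date is not an RFC 3339 UTC timestamp: " + build_date)

    # Assumption of this sketch: these three labels are expected on every image.
    for name in ("name", "vendor", "version"):
        if not labels.get(PREFIX + name):
            problems.append("missing or empty label: " + PREFIX + name)

    return problems


labels = {
    "org.label-schema.vendor": "Puppet",
    "org.label-schema.name": "Puppet Server",
    "org.label-schema.version": "2.6.0",
    "org.label-schema.build-date": "2016-09-08T23:20:50.52Z",
}
print(check_labels(labels))
```

An empty list means the mapping passed every check in the sketch. A single-digit day such as 2016-09-8 would be reported, because the pattern requires two digits for each date field.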
A really simple example of what that allows straightaway is querying for all your Puppet-provided images in one go:
docker images --filter "label=org.label-schema.vendor=Puppet"
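The same query is easy to reproduce in a tool of your own once the labels are available as data. The Python sketch below works on a hand-written list shaped like the JSON that docker inspect prints for images, with labels under Config and then Labels. The sample images and the helper name images_with_label are made up for illustration.

```python
import json

# Hand-written sample shaped like `docker inspect` output for two images.
# The image names and label values here are illustrative, not real data.
INSPECT_JSON = """
[
  {"RepoTags": ["puppet/puppetserver:2.6.0"],
   "Config": {"Labels": {"org.label-schema.vendor": "Puppet",
                         "org.label-schema.name": "Puppet Server"}}},
  {"RepoTags": ["example/app:latest"],
   "Config": {"Labels": null}}
]
"""


def images_with_label(images, key, value):
    """Return the tags of every image whose label `key` equals `value`."""
    matches = []
    for image in images:
        # Labels is null (None in Python) when an image carries no labels.
        labels = (image.get("Config") or {}).get("Labels") or {}
        if labels.get(key) == value:
            matches.extend(image.get("RepoTags") or [])
    return matches


images = json.loads(INSPECT_JSON)
print(images_with_label(images, "org.label-schema.vendor", "Puppet"))
```

Because the key is an agreed name rather than a per-tool convention, the same helper would work unchanged against images from any vendor that adopts the shared namespace.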
But the real advantages come as we and others build more tools atop this data. And the best thing for the user is that those tools can be interoperable, based on this simple agreement.
With the release candidate, we’re seeking as much input as possible. We have a mailing list and you can file issues or make pull requests against the label-schema.org GitHub repository. In addition to your feedback about the specification, I'd love to hear your ideas for tools that shared metadata can make possible. If you've already been building something around container labels, then do join the mailing list and let us know.
Gareth Rushgrove is a senior software engineer at Puppet.