Deep(er) dive into container labels and annotations
During the euroHPC Summit in Göteborg we discussed the latest developments and the goals we want to push within the euroHPC community. It boils down to raising awareness of containers in general and synchronizing the efforts of the different entities.
I gave an introduction to the HPC Container Conformance project (HPC3) in which I dissected annotations and labels and how we might deal with them.
In this blog post I'll provide an overview of what annotations and labels are about and how we might implement them in the context of HPC. Everything discussed here is based on discussions over the last couple of months with a bunch of folks from the container community, but it is not yet finalized. If you have input we should consider, please let me know!
Labels and Annotations
The terms labels and annotations are used somewhat synonymously in the container space. Let me first describe what they are and how they are used (so far).
The term labels might be familiar to most of you in the context of a Dockerfile. Labels are key-value pairs that are attached to an image.
A simple example is a Dockerfile that defines the labels foo=bar and bar=foo.
When you build the image and extract it in the docker format, you can inspect the labels as part of the config object referenced by the manifest (the numbered markers such as (1) in the listing are explained in the notes that follow it).
$ docker build -qt docker/label . && docker save docker/label |tar xf - -C docker
$ jq . docker/manifest.json
[
{ # (1)!
"Config": "c1b96264c73fab1cb265a72fae96320d0f5e3ae70664e33befb2ea56173c03c1.json",
"RepoTags": [
"docker/label:latest"
],
"Layers": [
"f35392eca05fc88b150da5a7343b9730fecf6931f840ba29c52849cdf4448a3a/layer.tar",
"a64644f1873b2190246b0e907b7b83dd32a06d875d294b776d6d42b68f5ec96b/layer.tar"
]
}
]
$ jq .config.Labels docker/c1b96264c73fab1cb265a72fae96320d0f5e3ae70664e33befb2ea56173c03c1.json
{ # (2)!
"bar": "foo",
"foo": "bar"
}
- (1) The manifest does not know about the labels; they are nested within the Config object.
- (2) Inspecting the Config blob itself reveals the labels.
A visual representation can be seen on the left. The manifest references a JSON object that contains information about how the container will be started (environment variables, entrypoint, command, UID:GID definitions, etc.). The Config object also contains the labels.
In addition, the manifest defines layers, which are extracted in the order they are listed.
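The lookup shown in the transcript above (manifest.json, then the Config blob, then its Labels) can also be scripted. The following is a minimal sketch, assuming the archive was unpacked into a directory as in the `docker save` example; the function name `read_docker_labels` is hypothetical and not part of any tool.

```python
import json
from pathlib import Path


def read_docker_labels(root):
    """Return {repo_tag: labels} for an unpacked `docker save` archive.

    Hypothetical helper for illustration; assumes the layout shown above
    (a manifest.json next to the config blob it references).
    """
    root = Path(root)
    manifest = json.loads((root / "manifest.json").read_text())
    result = {}
    for entry in manifest:
        # The manifest only points at the config blob; the labels live inside it.
        config = json.loads((root / entry["Config"]).read_text())
        labels = config.get("config", {}).get("Labels") or {}
        for tag in entry.get("RepoTags") or []:
            result[tag] = labels
    return result
```

Calling `read_docker_labels("docker")` on the directory from the example should return the foo/bar labels keyed by the image tag.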
Use in HPC
Some runtimes already use labels in the background. Sarus for instance will use container labels to trigger OCI hooks for a given container.
OCI Annotations
The OCI Image specification allows (optional) annotations to be included in an image manifest.
The OCI format creates an index (index.json) that holds one descriptor per manifest created (a single build command may produce multiple images for different platforms).
$ podman build -q --annotation anno=tation \ # (1)!
-f Dockerfile -t podman/labels .
cf73eacc27043d167b697dfb5fa84bfee1ca70f102a0b2eedf49ef52458e1044
$ podman save podman/labels --format oci-archive |tar xf - -C oci
$ jq . oci/index.json
{
"schemaVersion": 2,
"manifests": [
{
"mediaType": "application/vnd.oci.image.manifest.v1+json",
"digest": "sha256:5b747deb18ec0f4cdef4b86086dde9aadf54ecf389c6192e6ae0dd6ef6639014",
"size": 766,
"annotations": {
"org.opencontainers.image.ref.name": "localhost/podman/labels:latest"
}
}
]
}
- (1) By providing the --annotation flag to podman build we can add annotations to the manifest.
The manifest itself is a JSON object that contains the config object (similar to the Config object in the docker format), a list of layers, and annotations.
$ jq . oci/blobs/sha256/5b747deb18ec0f4cdef4b86086dde9aadf54ecf389c6192e6ae0dd6ef6639014
{
"schemaVersion": 2,
"mediaType": "application/vnd.oci.image.manifest.v1+json",
"config": {
"mediaType": "application/vnd.oci.image.config.v1+json",
"digest": "sha256:cf73eacc27043d167b697dfb5fa84bfee1ca70f102a0b2eedf49ef52458e1044",
"size": 1136
},
"layers": [
{
"mediaType": "application/vnd.oci.image.layer.v1.tar+gzip",
*snip*
"size": 182
}
],
"annotations": {
"anno": "tation", # (1)!
"org.opencontainers.image.base.digest": "sha256:60cfb06536035a143bbfbac665bae52d493c58b606818e96a7e4da180de8026c",
"org.opencontainers.image.base.name": "localhost/alpine:3.17"
}
}
$ jq .config.Labels oci/blobs/sha256/cf73eacc27043d167b697dfb5fa84bfee1ca70f102a0b2eedf49ef52458e1044
{ # (2)!
"bar": "foo",
"foo": "bar",
"io.buildah.version": "1.27.0"
}
$
- (1) The annotations are part of the manifest.
- (2) Again, you have the labels defined within the Dockerfile along with a label that is added by the build tool.
As you can see, the manifest carries the annotation anno=tation, while the labels in the config object do not include it.
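To compare both sides programmatically, the walk from index.json to the manifest blob and on to the config blob can be sketched as follows. This is an illustration only: the helper names `load_blob` and `read_oci_metadata` are hypothetical, and the sketch assumes every index entry points directly at an image manifest (no nested indexes).

```python
import json
from pathlib import Path


def load_blob(root, digest):
    """Load a JSON blob from an unpacked OCI layout, given 'algorithm:hex'."""
    algorithm, _, encoded = digest.partition(":")
    return json.loads((Path(root) / "blobs" / algorithm / encoded).read_text())


def read_oci_metadata(root):
    """Return one dict per manifest with its annotations and config labels.

    Hypothetical helper; assumes index entries reference image manifests
    directly, as in the podman example above.
    """
    index = json.loads((Path(root) / "index.json").read_text())
    results = []
    for descriptor in index["manifests"]:
        manifest = load_blob(root, descriptor["digest"])
        config = load_blob(root, manifest["config"]["digest"])
        results.append({
            # Annotations sit on the manifest ...
            "annotations": manifest.get("annotations") or {},
            # ... while labels are nested in the config blob.
            "labels": config.get("config", {}).get("Labels") or {},
        })
    return results
```

Run against the `oci` directory from the example, the result should show `anno` among the annotations but not among the labels.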
Danger
The problem is that most runtimes, be it Docker as shown above or even HPC runtimes, will transform the OCI format into their own, so the annotation will eventually be lost.
That is why we cannot rely on annotations to stick around, and why we should use labels as the ground truth.
Reverse DNS Notation
One common practice we as the HPC community might adopt is to treat labels in reverse DNS format as annotations. A label in this format would then be treated as an annotation and might be elevated to an actual annotation in formats that support them.
Idea
As labels and annotations are used synonymously, I'd like to propose calling key/value pairs with reverse DNS notation annotations and everything else labels.
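A minimal sketch of this split could look like the following. The pattern is a heuristic of my own choosing, not part of any specification: it treats a key as reverse DNS if it consists of at least two dot-separated components made of lowercase letters, digits, and hyphens. The names `is_reverse_dns` and `split_labels` are hypothetical.

```python
import re

# Assumption: "reverse DNS" means two or more dot-separated components of
# lowercase letters, digits and inner hyphens. This is a heuristic, not a spec.
_COMPONENT = r"[a-z0-9]+(?:-[a-z0-9]+)*"
_REVERSE_DNS = re.compile(rf"{_COMPONENT}(?:\.{_COMPONENT})+")


def is_reverse_dns(key):
    """Return True if the key looks like reverse DNS notation."""
    return _REVERSE_DNS.fullmatch(key) is not None


def split_labels(labels):
    """Split a label dict into (plain labels, annotation-style labels)."""
    plain = {k: v for k, v in labels.items() if not is_reverse_dns(k)}
    annotations = {k: v for k, v in labels.items() if is_reverse_dns(k)}
    return plain, annotations
```

With the labels from the podman example, `foo` and `bar` would stay plain labels, while `io.buildah.version` would be classified as an annotation.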
Constraints
To make sure that labels and annotations are used in a consistent way, we might define some constraints to help us create annotations that are consistent and do not contradict each other.
Label/Annotation Constraint
As stated before, labels are the ground truth and annotations are optional. A manifest annotation must not contradict a label.
If a label is present, an annotation with the same key must have the same value.
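This rule is easy to check mechanically. Here is a small sketch, assuming labels and annotations are available as plain dictionaries; the function name `find_contradictions` and the key `org.example.app.version` are hypothetical and used for illustration only.

```python
def find_contradictions(labels, annotations):
    """Return {key: (label_value, annotation_value)} where the two disagree.

    Keys that exist on only one side are fine: annotations are optional,
    and labels remain the ground truth.
    """
    return {
        key: (labels[key], value)
        for key, value in annotations.items()
        if key in labels and labels[key] != value
    }


# Hypothetical example data, not taken from a real image.
labels = {"org.example.app.version": "1.0", "foo": "bar"}
annotations = {"org.example.app.version": "2.0", "anno": "tation"}
print(find_contradictions(labels, annotations))
```

An empty result means the annotations are consistent with the labels.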
Index/Manifest Constraint
Within an index, annotations can be used to define commonalities among all manifests. The image on the left shows an example where the second manifest violates the constraint by having a different value for the version of the application.
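The commonality rule can be sketched as a check over the index and its manifests. Two assumptions of mine go beyond the text: a manifest that simply lacks an index-level key is not counted as a violation, and manifests are passed in as already-parsed dictionaries. The function name and the key `org.example.app.version` are hypothetical.

```python
def violating_manifests(index_annotations, manifests):
    """Return (position, key) pairs for manifests that contradict the index.

    Assumption: a manifest without the key inherits the index value and is
    therefore not reported.
    """
    violations = []
    for position, manifest in enumerate(manifests):
        annotations = manifest.get("annotations") or {}
        for key, expected in index_annotations.items():
            if key in annotations and annotations[key] != expected:
                violations.append((position, key))
    return violations


# Hypothetical data mirroring the figure: the second manifest disagrees.
index_annotations = {"org.example.app.version": "1.0"}
manifests = [
    {"annotations": {"org.example.app.version": "1.0"}},
    {"annotations": {"org.example.app.version": "2.0"}},
]
print(violating_manifests(index_annotations, manifests))
```

Positions are zero-based, so the second manifest is reported at position 1.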
A more advanced use case is informing the container engine what to expect from an image index. To support engines (like podman) that are able to make use of the zstd compression algorithm (which deserves a blog post of its own), we can add an annotation to the index that tells such engines they are going to find zstd-compressed images in it.
Conclusion
To sum up the blog post, let's collect the (proposed) takeaways:
- Labels are the ground truth
- Manifest or image index annotations are optional
- Higher-level annotations must not contradict labels
- Index annotations can be used to define commonalities among manifests
- Annotations can be used to inform the container engine about the content of the image index.
Please let me know if you agree or (even more important) if you disagree. :)