In this article, we'll explore why you should avoid storing sensitive data in an image layer when working with Docker. We'll use the Docker CLI to build an image from a Dockerfile and demonstrate the security risk that arises when the build includes sensitive information.
First, let's break down the Dockerfile we'll be working with:
FROM alpine
RUN touch /secret.txt
RUN echo "sensitive-data" > /secret.txt
RUN rm /secret.txt
The FROM instruction starts the image from the Alpine base image and its layer.
The first RUN instruction creates a new file, altering the filesystem and thus creating a new layer.
The second RUN instruction writes data into the file, resulting in another new layer.
Finally, the last RUN instruction removes the file and writes the result into a new layer.
Metadata-only instructions such as LABEL (not used in this Dockerfile) modify the image's metadata without creating a new filesystem layer.
Now, let's build the image using the Docker CLI:
$ docker build . -t secret
We've successfully created a new image from the Dockerfile.
If you publish this image to a popular registry like Docker Hub, keep in mind that anyone who can pull the image can read any file stored in any of its layers. Docker stores each layer separately, so even if a subsequent layer removes the secret.txt file, the previous layer still contains the sensitive data.
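To make this layering behavior concrete, here is a minimal Python sketch that models the three RUN steps as in-memory tar archives. It assumes the OCI whiteout convention, where deleting secret.txt is recorded in the later layer as an empty marker file named .wh.secret.txt. The helper names (make_layer, read_layer, merged_view) are hypothetical and exist only for this illustration; this is a simplified model, not Docker's actual implementation.

```python
import io
import tarfile

# Assumption: OCI-style whiteouts, where ".wh.<name>" marks <name> as deleted.
WHITEOUT_PREFIX = ".wh."


def make_layer(files):
    """Build an in-memory tar archive from a {path: bytes} mapping."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        for path, data in files.items():
            info = tarfile.TarInfo(name=path)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()


def read_layer(layer):
    """Return the regular files stored in one layer as {path: bytes}."""
    with tarfile.open(fileobj=io.BytesIO(layer)) as tar:
        return {m.name: tar.extractfile(m).read()
                for m in tar.getmembers() if m.isfile()}


def merged_view(layers):
    """Apply layers in order and return the filesystem a container would see."""
    fs = {}
    for layer in layers:
        for path, data in read_layer(layer).items():
            directory, _, name = path.rpartition("/")
            if name.startswith(WHITEOUT_PREFIX):
                # A whiteout hides the file from the merged view only;
                # the earlier layer archive is left untouched.
                target = name[len(WHITEOUT_PREFIX):]
                if directory:
                    target = f"{directory}/{target}"
                fs.pop(target, None)
            else:
                fs[path] = data
    return fs


layers = [
    make_layer({"secret.txt": b""}),                  # RUN touch /secret.txt
    make_layer({"secret.txt": b"sensitive-data\n"}),  # RUN echo ... > /secret.txt
    make_layer({".wh.secret.txt": b""}),              # RUN rm /secret.txt
]

print("secret.txt" in merged_view(layers))  # False
print(read_layer(layers[1])["secret.txt"])  # b'sensitive-data\n'
```

The merged view no longer shows the file, yet the second archive still carries its contents, and that archive is what gets shipped with the image.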
Let's see that in action:
$ docker run --rm -it secret /bin/sh
/ # ls
bin etc lib mnt proc run srv tmp var
dev home media opt root sbin sys usr
/ # exit
Upon running the image, you won't find the secret.txt file in the filesystem. However, the file is still embedded in an image layer.
Let's explore that by converting the image into a tar file using the docker save command:
$ mkdir secret-data
$ cd secret-data
$ docker save secret -o secret-image-unpack.tar
$ tar -xf secret-image-unpack.tar
$ ls
35c8305da895b7eb8329795724dca7b962df13090760e034de2061003e428ffa
47d192fd3df9c48cc1e2cc06900bddb9ac9417d0c4c039ab04288876ae449983.json
5bd981e7c2ca18bdef505759d56db73e93fe31fbf20c60c7adfc2ecefecc8a0e
619fc296dd43f00de5c3529dd4e2f33d1221379eee560ff273a7f35042759523
f60ab9307adae1629c41c2bec5e9efa0c105538846bbb31d529b1569f9959f63
manifest.json
repositories
secret-image-unpack.tar
Upon untarring, you'll find some interesting information. Let's explore it step by step:
manifest.json is a file that describes the image: it names the configuration file, the tags, and the layers in the order they are applied, where f60ab is the base layer and 35c8 is the last layer.
$ cat manifest.json | jq
[
{
"Config": "47d192fd3df9c48cc1e2cc06900bddb9ac9417d0c4c039ab04288876ae449983.json",
"RepoTags": [
"secret:latest"
],
"Layers": [
"f60ab9307adae1629c41c2bec5e9efa0c105538846bbb31d529b1569f9959f63/layer.tar",
"5bd981e7c2ca18bdef505759d56db73e93fe31fbf20c60c7adfc2ecefecc8a0e/layer.tar",
"619fc296dd43f00de5c3529dd4e2f33d1221379eee560ff273a7f35042759523/layer.tar",
"35c8305da895b7eb8329795724dca7b962df13090760e034de2061003e428ffa/layer.tar"
]
}
]
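The layer order can also be read programmatically. The following Python sketch parses the manifest shown above and prints the layers from base to top. It assumes the archive layout seen in this article, with one directory per layer containing a layer.tar; the layout produced by docker save can differ between Docker versions. The function name layer_ids is hypothetical.

```python
import json

# The manifest.json content shown above, embedded so the example is self-contained.
MANIFEST = """
[
  {
    "Config": "47d192fd3df9c48cc1e2cc06900bddb9ac9417d0c4c039ab04288876ae449983.json",
    "RepoTags": ["secret:latest"],
    "Layers": [
      "f60ab9307adae1629c41c2bec5e9efa0c105538846bbb31d529b1569f9959f63/layer.tar",
      "5bd981e7c2ca18bdef505759d56db73e93fe31fbf20c60c7adfc2ecefecc8a0e/layer.tar",
      "619fc296dd43f00de5c3529dd4e2f33d1221379eee560ff273a7f35042759523/layer.tar",
      "35c8305da895b7eb8329795724dca7b962df13090760e034de2061003e428ffa/layer.tar"
    ]
  }
]
"""


def layer_ids(manifest_text):
    """Return the layer directory names in the order they are applied (base first)."""
    image = json.loads(manifest_text)[0]  # assumes the archive holds a single image
    return [path.split("/")[0] for path in image["Layers"]]


for position, layer_id in enumerate(layer_ids(MANIFEST), start=1):
    print(position, layer_id[:12])
```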
The config file (47d192...json) includes the history of how the image was built and configured.
$ cat 47d192fd3df9c48cc1e2cc06900bddb9ac9417d0c4c039ab04288876ae449983.json | jq '.history'
[
{
"created": "2023-09-28T21:19:27.686110063Z",
"created_by": "/bin/sh -c #(nop) ADD file:756183bba9c7f4593c2b216e98e4208b9163c4c962ea0837ef88bd917609d001 in / "
},
{
"created": "2023-09-28T21:19:27.801479409Z",
"created_by": "/bin/sh -c #(nop) CMD [\"/bin/sh\"]",
"empty_layer": true
},
{
"created": "2023-11-07T12:32:37.493886117+05:30",
"created_by": "RUN /bin/sh -c touch /secret.txt # buildkit",
"comment": "buildkit.dockerfile.v0"
},
{
"created": "2023-11-07T12:32:37.961070242+05:30",
"created_by": "RUN /bin/sh -c echo \"sensitive-data\" > /secret.txt # buildkit",
"comment": "buildkit.dockerfile.v0"
},
{
"created": "2023-11-07T12:32:38.419003725+05:30",
"created_by": "RUN /bin/sh -c rm /secret.txt # buildkit",
"comment": "buildkit.dockerfile.v0"
}
]
As you can see, in this case the sensitive data is already revealed by the history entry for the step that runs the echo command.
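Each history entry that is not marked "empty_layer" corresponds, in order, to one entry in the manifest's list of layers. The echo step is the third such entry, so from the manifest file we can see that 619fc is the layer where we put the sensitive data into the file.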
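This pairing of history entries with layers can be sketched in Python. The history below is copied from the config output above (only the fields that matter here), and the layer IDs are abbreviated to five-character prefixes for readability. The function name pair_history_with_layers is hypothetical, and the sketch assumes that history entries and layers line up one to one once metadata-only entries are skipped.

```python
# History entries from the config file above, trimmed to the relevant fields.
HISTORY = [
    {"created_by": "/bin/sh -c #(nop) ADD file:756183bba9c7f4593c2b216e98e4208b9163c4c962ea0837ef88bd917609d001 in / "},
    {"created_by": "/bin/sh -c #(nop) CMD [\"/bin/sh\"]", "empty_layer": True},
    {"created_by": "RUN /bin/sh -c touch /secret.txt # buildkit"},
    {"created_by": "RUN /bin/sh -c echo \"sensitive-data\" > /secret.txt # buildkit"},
    {"created_by": "RUN /bin/sh -c rm /secret.txt # buildkit"},
]

# Layer order from manifest.json; IDs abbreviated to their first five characters.
LAYERS = ["f60ab", "5bd98", "619fc", "35c83"]


def pair_history_with_layers(history, layers):
    """Return (layer, command) pairs, skipping metadata-only history entries."""
    producing = [entry for entry in history if not entry.get("empty_layer", False)]
    if len(producing) != len(layers):
        raise ValueError("history entries and layers do not line up")
    return [(layer, entry["created_by"]) for layer, entry in zip(layers, producing)]


for layer, command in pair_history_with_layers(HISTORY, LAYERS):
    print(layer, "<-", command)
```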
Let's unpack the tar file and see what contents that layer holds:
$ cd 619fc296dd43f00de5c3529dd4e2f33d1221379eee560ff273a7f35042759523
$ ls
json layer.tar VERSION
$ tar -xf layer.tar
$ ls
etc json layer.tar secret.txt VERSION
$ cat secret.txt
sensitive-data
Surprisingly, the file is still stored in this layer even though it was deleted in the subsequent layer. Anyone can easily obtain that file by unpacking the image, so the best practice is to never put information into an image that you are not supposed to reveal to the general public.
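The manual search above can be automated. The following Python sketch walks every layer listed in manifest.json of an unpacked docker save archive and reports which layer archives contain a given file. The function name find_in_layers is hypothetical, and the sketch assumes the directory holds a manifest.json next to the layer tarballs it references, as in this article.

```python
import json
import tarfile
from pathlib import Path


def find_in_layers(image_dir, filename):
    """Return the layer archives (paths from manifest.json) that contain `filename`.

    `image_dir` is a directory holding an unpacked `docker save` archive.
    """
    image_dir = Path(image_dir)
    manifest = json.loads((image_dir / "manifest.json").read_text())
    hits = []
    for layer_path in manifest[0]["Layers"]:  # assumes a single image in the archive
        with tarfile.open(image_dir / layer_path) as layer:
            # Member names may start with "./", so strip that prefix before comparing.
            names = {name[2:] if name.startswith("./") else name
                     for name in layer.getnames()}
        if filename in names:
            hits.append(layer_path)
    return hits
```

Called as find_in_layers("secret-data", "secret.txt") on the directory created earlier, it should list every layer archive that still carries the file, regardless of whether a later layer deleted it.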
I hope you have gained some insight into how Docker builds an image and stores its contents as a series of layers. It's crucial to handle sensitive information securely, outside of the image layers, to mitigate potential security risks.
If you are interested in Docker, or containers in general, let's connect on Twitter (@narharistwt) and share what we learn.