docker changes the image shasum while saving it #5515
Comments
I think this may be a limitation of the "graphdriver" image store in the Docker engine. The graphdriver store was designed to be optimized for local disk consumption. As part of that, images pulled from a registry are extracted after they are pulled, after which the compressed layers are discarded; only the extracted layers and information about the pulled layers are preserved. When saving an image, or pushing it to the same registry, these layers and the related "image manifests" are reconstructed, but this step is not reproducible (due to compression artifacts as well as timestamps included in image manifest metadata).
That said, I tried to see what the differences between the saved files are, and ... honestly, couldn't immediately find any. A possible reason could be the order in which files are included in the tar header, but they seem to be identical in every other way:
docker pull alpine
Using default tag: latest
latest: Pulling from library/alpine
Digest: sha256:beefdbd8a1da6d2915566fde36db9db0b524eb737fc57cd1367effd16dc0d06d
Status: Downloaded newer image for alpine:latest
docker.io/library/alpine:latest
docker image save -o one.tar alpine:latest
docker image save -o two.tar alpine:latest
shasum one.tar
05ee3ff4ae600438a025ab12339395bdc94dfa85 one.tar
shasum two.tar
1b74f13ee5f67bc8345d0d4cd1e70119c3990feb two.tar
tar --xattrs -tvf one.tar
drwxr-xr-x 0/0 0 2024-09-06 22:20 blobs/
drwxr-xr-x 0/0 0 2024-10-08 11:48 blobs/sha256/
-rw-r--r-- 0/0 401 1970-01-01 00:00 blobs/sha256/309ff318b44b4f2af442a37a269a93ce6907d277d2c168d3160f36cc802f8838
-rw-r--r-- 0/0 8081920 2024-09-06 22:20 blobs/sha256/63ca1fbb43ae5034640e5e6cb3e083e05c290072c5366fcaa9d62435a4cced85
-rw-r--r-- 0/0 1143 2024-09-06 22:20 blobs/sha256/6ad8fd5c38430e1ab05f033c689994934a216c1a7481aeb44de1239d7ca82f77
-rw-r--r-- 0/0 1471 2024-09-06 22:20 blobs/sha256/91ef0af61f39ece4d6710e465df5ed6ca12112358344fd51ae6a3b886634148b
-rw-r--r-- 0/0 362 2024-10-08 11:48 index.json
-rw-r--r-- 0/0 457 1970-01-01 00:00 manifest.json
-rw-r--r-- 0/0 31 1970-01-01 00:00 oci-layout
-rw-r--r-- 0/0 89 1970-01-01 00:00 repositories
tar --xattrs -tvf two.tar
drwxr-xr-x 0/0 0 2024-09-06 22:20 blobs/
drwxr-xr-x 0/0 0 2024-10-08 11:48 blobs/sha256/
-rw-r--r-- 0/0 401 1970-01-01 00:00 blobs/sha256/309ff318b44b4f2af442a37a269a93ce6907d277d2c168d3160f36cc802f8838
-rw-r--r-- 0/0 8081920 2024-09-06 22:20 blobs/sha256/63ca1fbb43ae5034640e5e6cb3e083e05c290072c5366fcaa9d62435a4cced85
-rw-r--r-- 0/0 1143 2024-09-06 22:20 blobs/sha256/6ad8fd5c38430e1ab05f033c689994934a216c1a7481aeb44de1239d7ca82f77
-rw-r--r-- 0/0 1471 2024-09-06 22:20 blobs/sha256/91ef0af61f39ece4d6710e465df5ed6ca12112358344fd51ae6a3b886634148b
-rw-r--r-- 0/0 362 2024-10-08 11:48 index.json
-rw-r--r-- 0/0 457 1970-01-01 00:00 manifest.json
-rw-r--r-- 0/0 31 1970-01-01 00:00 oci-layout
-rw-r--r-- 0/0 89 1970-01-01 00:00 repositories
I think switching to the containerd image store may help here. When using the containerd image store ("snapshotters"), pulled images, including their compressed layers, are kept, and the exported tar looks to be fully reproducible:
docker pull alpine
Using default tag: latest
latest: Pulling from library/alpine
Digest: sha256:beefdbd8a1da6d2915566fde36db9db0b524eb737fc57cd1367effd16dc0d06d
Status: Downloaded newer image for alpine:latest
docker.io/library/alpine:latest
docker save -o c8d-one.tar alpine:latest
docker save -o c8d-two.tar alpine:latest
shasum c8d-one.tar
b4d8c4f578be934ad2c0a82f7efd184cf027d27f c8d-one.tar
shasum c8d-two.tar
b4d8c4f578be934ad2c0a82f7efd184cf027d27f c8d-two.tar
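When two saved archives hash differently but their `tar -tvf` listings look identical, it can help to compare the members field by field. The listing prints modification times only to the minute, so a difference of a few seconds in a header may not show up there. The following is a minimal sketch using Python's standard `tarfile` module; `member_summary` and `diff_archives` are hypothetical helper names, not part of Docker or any existing tool.

```python
import hashlib
import tarfile


def _sha256(fileobj, chunk_size=1 << 20):
    """Hash a file-like object in chunks so large layer blobs are not loaded at once."""
    digest = hashlib.sha256()
    for chunk in iter(lambda: fileobj.read(chunk_size), b""):
        digest.update(chunk)
    return digest.hexdigest()


def member_summary(archive):
    """Map each member name to (type, mode, size, mtime, sha256 of content)."""
    summary = {}
    with tarfile.open(archive) as tar:
        for member in tar:
            # Only regular files carry content; directories get no digest.
            content = _sha256(tar.extractfile(member)) if member.isfile() else None
            summary[member.name] = (
                member.type, member.mode, member.size, member.mtime, content
            )
    return summary


def diff_archives(first, second):
    """Return (name, field, value_in_first, value_in_second) for every mismatch."""
    a, b = member_summary(first), member_summary(second)
    fields = ("type", "mode", "size", "mtime", "sha256")
    changes = []
    for name in sorted(a.keys() | b.keys()):
        if name not in a or name not in b:
            changes.append((name, "presence", name in a, name in b))
            continue
        for field, left, right in zip(fields, a[name], b[name]):
            if left != right:
                changes.append((name, field, left, right))
    return changes
```

Calling `diff_archives("one.tar", "two.tar")` returns an empty list when every member matches in type, mode, size, mtime, and content. Member order and header padding are not compared, so an empty result does not by itself prove the archives are byte-identical.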
If you have an environment to test on, it's worth switching to the containerd image store (which also provides support for storing multi-arch images).
Be aware, though, that switching the store also switches to a different location for storing images and containers. Your existing images won't be deleted, but they won't be accessible either (and will still consume space). If possible, my recommendation is to remove content (containers, images) before switching.
Thanks for your response. What is the default storage driver used in Docker?
Can I switch the containerd configuration to use the same storage driver used by Docker?
Docker (without the containerd image store) selects the default storage driver based on the underlying filesystem. In most cases that is When using the containerd image store, no detection is done currently, but the default will be the
I've reproduced the issue:
❯ docker save -o one.tar.gz debian:latest
❯ docker save -o two.tar.gz debian:latest
❯ wc -c one.tar.gz two.tar.gz
143606272 one.tar.gz
143606272 two.tar.gz
287212544 total
❯ shasum one.tar.gz two.tar.gz
d068d04161345aa5693859dbfc6015913fdd8af7 one.tar.gz
930e62e8f0cd9a24af709107f0b199ff87e570be two.tar.gz
On first pass, the metadata looks the same:
❯ shasum <(tar tvf one.tar.gz) <(tar tvf two.tar.gz)
cf795d491009e668091a1a13d83b949d00a80073 /dev/fd/14
cf795d491009e668091a1a13d83b949d00a80073 /dev/fd/15
However, if we look at the binary records, there is a clear difference:
❯ diff -ru <(hexdump -C one.tar.gz) <(hexdump -C two.tar.gz)
--- /dev/fd/14 2024-10-08 12:55:38
+++ /dev/fd/15 2024-10-08 12:55:56
@@ -20,7 +20,7 @@
00000260 00 00 00 00 30 30 30 30 37 35 35 00 30 30 30 30 |....0000755.0000|
00000270 30 30 30 00 30 30 30 30 30 30 30 00 30 30 30 30 |000.0000000.0000|
00000280 30 30 30 30 30 30 30 00 31 34 37 30 31 33 30 36 |0000000.14701306|
-00000290 30 31 36 00 30 31 31 33 35 33 00 20 35 00 00 00 |016.011353. 5...|
+00000290 30 32 34 00 30 31 31 33 35 32 00 20 35 00 00 00 |024.011352. 5...|
000002a0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
*
00000300 00 75 73 74 61 72 00 30 30 00 00 00 00 00 00 00 |.ustar.00.......|
@@ -6857398,7 +6857398,7 @@
088f2e60 00 00 00 00 30 30 30 30 36 34 34 00 30 30 30 30 |....0000644.0000|
088f2e70 30 30 30 00 30 30 30 30 30 30 30 00 30 30 30 30 |000.0000000.0000|
088f2e80 30 30 30 30 35 35 32 00 31 34 37 30 31 33 30 36 |0000552.14701306|
-088f2e90 30 31 36 00 30 31 31 32 34 36 00 20 30 00 00 00 |016.011246. 0...|
+088f2e90 30 32 34 00 30 31 31 32 34 35 00 20 30 00 00 00 |024.011245. 0...|
088f2ea0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
*
088f2f00 00 75 73 74 61 72 00 30 30 00 00 00 00 00 00 00 |.ustar.00.......|
The differences seem to map to the timestamp field and the checksum that follows it. In both cases we change from
❯ tar tvf one.tar.gz
drwxr-xr-x 0 0 0 0 Jul 1 17:39 blobs/
drwxr-xr-x 0 0 0 0 Oct 8 12:46 blobs/sha256/
-rw-r--r-- 0 0 0 403 Dec 31 1969 blobs/sha256/c89edf5050f4db4a7ac20a64bdb77f7ddca76dfc2c87a39fddca419084dca080
-rw-r--r-- 0 0 0 143594496 Jul 1 17:39 blobs/sha256/d1660adccd2b42ad0160cba9a291ef75a87223577240a585a7f1cb90676ec3b8
-rw-r--r-- 0 0 0 1152 Jul 1 17:39 blobs/sha256/d5156a0989b7b62fd13b9f28e7e1864554ae6b47657a2efc503b097818653cad
-rw-r--r-- 0 0 0 1477 Jul 1 17:39 blobs/sha256/f753e4d18c7075845e84d759f49d57529f268aa7a262b517fd9f3d62749890eb
-rw-r--r-- 0 0 0 362 Oct 8 12:46 index.json
-rw-r--r-- 0 0 0 459 Dec 31 1969 manifest.json
-rw-r--r-- 0 0 0 31 Dec 31 1969 oci-layout
-rw-r--r-- 0 0 0 89 Dec 31 1969 repositories
From here, we can see that
Here's some of my info:
Note that I do not have the containerd snapshotter enabled (I should though ;) ), so this is just from the overlay2 graphdriver. I don't remember if this is in graphdriver or not but it likely is. As a matter of course, this really isn't a
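The fields that differ in the hexdump above can be decoded by hand. In a ustar header the modification time is stored as an octal ASCII string, and the header checksum is the sum of the header bytes, so changing the timestamp digits also changes the checksum. The sketch below decodes the two values visible in the diff (`14701306016` and `14701306024`); `decode_mtime` is a hypothetical helper name used only for illustration.

```python
from datetime import datetime, timezone


def decode_mtime(octal_field):
    """Convert a ustar mtime field (octal ASCII digits) to a UTC datetime."""
    return datetime.fromtimestamp(int(octal_field, 8), tz=timezone.utc)


first = decode_mtime("14701306016")   # value in one.tar.gz
second = decode_mtime("14701306024")  # value in two.tar.gz

print(first.isoformat())                   # 2024-10-08T19:46:22+00:00
print(second.isoformat())                  # 2024-10-08T19:46:28+00:00
print((second - first).total_seconds())    # 6.0

# The checksum is a byte sum over the header, so it shifts by the same
# amount as the ASCII digits that changed ("016" -> "024").
digit_delta = sum(b"024") - sum(b"016")
checksum_delta = int("011352", 8) - int("011353", 8)
print(digit_delta, checksum_delta)         # -1 -1
```

The two timestamps are six seconds apart and fall in the same minute that the listing shows for `blobs/sha256/` and `index.json` (12:46 local time, assuming the listing was printed in a UTC-7 zone). That is consistent with those two entries being stamped with the time of each `docker save` run, and it explains why the minute-resolution listings hash identically.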
Ok, breaking this down to make the fix easier. We have two bugs:
Let me know once this feature is merged and available.
Description
scenario
docker save -o tarfilename imagename:tagname
Every time we save the same image, docker changes the shasum of the resulting tar file; the sha values should instead be identical.
Reproduce
docker save -o tarfilename imagename:tagname
Save the same image again, this time to tarfilename1
Run shasum tarfilename
Run shasum tarfilename1
The sha values will be different
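The comparison in the steps above can also be scripted. This is a minimal sketch, assuming both archives have already been written with `docker save`; `file_sha1` and `same_archive` are hypothetical helper names. SHA-1 is used because that is what `shasum` computes by default.

```python
import hashlib


def file_sha1(path, chunk_size=1 << 20):
    """Return the SHA-1 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha1()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_archive(first, second):
    """True when both files are byte-for-byte identical according to SHA-1."""
    return file_sha1(first) == file_sha1(second)
```

For example, `same_archive("tarfilename", "tarfilename1")` would return `False` for the behavior reported here.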
Expected behavior
No response
docker version
docker version 1.24.6
docker info
docker version 1.24.6
Additional Info
No response