Multi-Architecture Builds Succeed But Fail to Push #1709

Closed
ichasepucks opened this issue Sep 27, 2023 · 34 comments · Fixed by #1751

@ichasepucks

Description

I'm able to run the build goal successfully, which is able to auth and pull down images.

[INFO] --- docker:0.43.4:build (build-docker-image) @ application ---
[INFO] DOCKER> GET unix://127.0.0.1:1/version
[INFO] Building tar: /Users/toconnor/Development/application/target/docker/application-img/tmp/docker-build.tar
[INFO] DOCKER> [application-img:latest] "application": Created docker-build.tar in 570 milliseconds
[INFO] DOCKER> Credentials helper reply for "docker-credential-osxkeychain" is docker-credential-osxkeychain (github.com/docker/docker-credential-helpers) v0.7.0
[INFO] DOCKER> Using Docker CLI 24.0.6
[INFO] DOCKER> docker --config /Users/toconnor/Development/application/target/docker/application-img/docker buildx create --driver docker-container --name multiarch
[INFO] DOCKER> multiarch
[INFO] Expanding: /Users/toconnor/Development/application/target/docker/application-img/tmp/docker-build.tar into /Users/toconnor/Development/application/target/docker/application-img/tmp/docker-build
[INFO] DOCKER> docker --config /Users/toconnor/Development/application/target/docker/application-img/docker buildx build --progress=plain --builder multiarch --platform linux/arm64 --tag application-img:latest --file=/Users/toconnor/Development/application/target/docker/application-img/tmp/docker-build/Dockerfile /Users/toconnor/Development/application/target/docker/application-img/tmp/docker-build --load
[INFO] DOCKER> #0 building with "multiarch" instance using docker-container driver
[INFO] DOCKER>
[INFO] DOCKER> #1 [internal] booting buildkit
[INFO] DOCKER> #1 pulling image moby/buildkit:buildx-stable-1
[INFO] DOCKER> #1 pulling image moby/buildkit:buildx-stable-1 1.4s done
[INFO] DOCKER> #1 creating container buildx_buildkit_multiarch0
[INFO] DOCKER> #1 creating container buildx_buildkit_multiarch0 0.6s done
[INFO] DOCKER> #1 DONE 2.0s
[INFO] DOCKER>
[INFO] DOCKER> #2 [internal] load build definition from Dockerfile
[INFO] DOCKER> #2 transferring dockerfile: 618B done
[INFO] DOCKER> #2 DONE 0.0s
[INFO] DOCKER>
[INFO] DOCKER> #3 [internal] load metadata for docker-repo/asr/java_v17_oracle_build_debug:3
[INFO] DOCKER> #3 ...
[INFO] DOCKER>
[INFO] DOCKER> #4 [auth] asr/java_v17_oracle_build_debug:pull token for docker-repo
[INFO] DOCKER> #4 DONE 0.0s
[INFO] DOCKER>
[INFO] DOCKER> #5 [auth] asr/java_v17_oracle_runtime:pull token for docker-repo
[INFO] DOCKER> #5 DONE 0.0s
[INFO] DOCKER>
[INFO] DOCKER> #6 [internal] load metadata for docker-repo/asr/java_v17_oracle_runtime:2.8.0
[INFO] DOCKER> #6 DONE 1.4s
[INFO] DOCKER>
[INFO] DOCKER> #3 [internal] load metadata for docker-repo/asr/java_v17_oracle_build_debug:3
[INFO] DOCKER> #3 DONE 1.5s
[INFO] DOCKER>
[INFO] DOCKER> #7 [internal] load .dockerignore
[INFO] DOCKER> #7 transferring context: 2B done
[INFO] DOCKER> #7 DONE 0.0s
[INFO] DOCKER>

... 

[INFO] DOCKER>
[INFO] DOCKER> #16 exporting to docker image format
[INFO] DOCKER> #16 exporting layers
[INFO] DOCKER> #16 exporting layers 2.2s done
[INFO] DOCKER> #16 exporting manifest sha256:8e266ee3684ab9ccac16475ed59998e358e2af6f506376bd9618c9542e376d7d 0.0s done
[INFO] DOCKER> #16 exporting config sha256:e8c5bb4c156e81f0d0b110cb66fc1eca0864974eee9a3f8f2e03b63952414a3e
[INFO] DOCKER> #16 exporting config sha256:e8c5bb4c156e81f0d0b110cb66fc1eca0864974eee9a3f8f2e03b63952414a3e 0.0s done
[INFO] DOCKER> #16 sending tarball
[INFO] DOCKER> #16 ...
[INFO] DOCKER>
[INFO] DOCKER> #17 importing to docker
[INFO] DOCKER> #17 DONE 0.7s
[INFO] DOCKER>
[INFO] DOCKER> #16 exporting to docker image format
[INFO] DOCKER> #16 sending tarball 3.6s done
[INFO] DOCKER> #16 DONE 5.8s

However, the push goal fails with a 403.

[INFO] --- docker:0.43.4:push (push-docker-image) @ application ---
[INFO] DOCKER> GET unix://127.0.0.1:1/version
[INFO] DOCKER> Using Docker CLI 24.0.6
[INFO] Expanding: /Users/toconnor/Development/application/target/docker/application-img/tmp/docker-build.tar into /Users/toconnor/Development/application/target/docker/application-img/tmp/docker-build
[INFO] DOCKER> docker --config /Users/toconnor/Development/application/target/docker/application-img/docker buildx build --progress=plain --builder multiarch --platform linux/amd64,linux/arm64 --tag remote-repo/application-img:latest --file=/Users/toconnor/Development/application/target/docker/application-img/tmp/docker-build/Dockerfile /Users/toconnor/Development/application/target/docker/application-img/tmp/docker-build --push
[INFO] DOCKER> #0 building with "multiarch" instance using docker-container driver
[INFO] DOCKER>
[INFO] DOCKER> #1 [internal] load build definition from Dockerfile
[INFO] DOCKER> #1 transferring dockerfile: 618B done
[INFO] DOCKER> #1 DONE 0.0s
[INFO] DOCKER>
[INFO] DOCKER> #2 [linux/arm64 internal] load metadata for docker-repo/asr/java_v17_oracle_build_debug:3
[INFO] DOCKER> #2 ERROR: unexpected status from HEAD request to https://docker-repo/v2/asr/java_v17_oracle_build_debug/manifests/3: 403 Forbidden
[INFO] DOCKER>
[INFO] DOCKER> #3 [linux/amd64 internal] load metadata for docker-repo/asr/java_v17_oracle_build_debug:3
[INFO] DOCKER> #3 ERROR: unexpected status from HEAD request to https://docker-repo/v2/asr/java_v17_oracle_build_debug/manifests/3: 403 Forbidden
[INFO] DOCKER>
[INFO] DOCKER> #4 [linux/amd64 internal] load metadata for docker-repo/asr/java_v17_oracle_runtime:2.8.0
[INFO] DOCKER> #4 CANCELED
[INFO] DOCKER>
[INFO] DOCKER> #5 [linux/arm64 internal] load metadata for docker-repo/asr/java_v17_oracle_runtime:2.8.0
[INFO] DOCKER> #5 CANCELED
[INFO] DOCKER> ------
[INFO] DOCKER>  > [linux/amd64 internal] load metadata for docker-repo/asr/java_v17_oracle_build_debug:3:
[INFO] DOCKER> ------
[INFO] DOCKER> ------
[INFO] DOCKER>  > [linux/arm64 internal] load metadata for docker-repo/asr/java_v17_oracle_build_debug:3:
[INFO] DOCKER> ------
[INFO] DOCKER> Dockerfile:5
[INFO] DOCKER> --------------------
[INFO] DOCKER>    3 |     ARG jacocoVersion
[INFO] DOCKER>    4 |
[INFO] DOCKER>    5 | >>> FROM ${buildImage} as builder
[INFO] DOCKER>    6 |     ARG jacocoVersion
[INFO] DOCKER> --------------------
[INFO] DOCKER> ERROR: failed to solve: docker-repo/asr/java_v17_oracle_build_debug:3: unexpected status from HEAD request to https://docker-repo/v2/asr/java_v17_oracle_build_debug/manifests/3: 403 Forbidden
[ERROR] DOCKER> Error status (1) when building

To make things even more confusing, if I copy the command and execute it locally in the shell, the push works as expected. Specifically, this is the command, copied verbatim from DMP's debug output:

docker --config /Users/toconnor/Development/application/target/docker/application-img/docker buildx build --progress=plain --builder multiarch --platform linux/amd64,linux/arm64 --tag remote-repo/application-img:latest --file=/Users/toconnor/Development/application/target/docker/application-img/tmp/docker-build/Dockerfile /Users/toconnor/Development/application/target/docker/application-img/tmp/docker-build --push

I've tried every authentication mechanism specified in the documentation and they all result in the same behavior.

Both the source registry and destination registry are internally hosted on Artifactory and are v2 Docker registries. They require auth for all requests. The fact that I'm able to build successfully proves that I have authentication configured correctly. On the surface it appears that the HEAD request may not be sending authentication credentials but all the other requests do?
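
One way to test that hypothesis outside of Maven would be to issue the same HEAD request by hand, once with credentials and once without, and compare the status codes. The following is only a sketch: the class and method names are hypothetical, the URL layout is taken from the failing request in the log above, and it assumes the registry accepts HTTP Basic auth (registries that use token-based auth behave differently).

```java
import java.net.URI;
import java.net.http.HttpRequest;
import java.nio.charset.StandardCharsets;
import java.util.Base64;

final class ManifestHeadRequestSketch {

    // Builds (but does not send) a HEAD request for an image manifest:
    //   https://<registry>/v2/<repository>/manifests/<reference>
    // Pass null as username to build an unauthenticated request.
    static HttpRequest manifestHead(String registry, String repository, String reference,
                                    String username, String password) {
        URI uri = URI.create("https://" + registry + "/v2/" + repository + "/manifests/" + reference);
        HttpRequest.Builder builder = HttpRequest.newBuilder(uri)
                .method("HEAD", HttpRequest.BodyPublishers.noBody());
        if (username != null) {
            // ASSUMPTION: the registry accepts HTTP Basic auth.
            String basic = Base64.getEncoder().encodeToString(
                    (username + ":" + password).getBytes(StandardCharsets.UTF_8));
            builder.header("Authorization", "Basic " + basic);
        }
        return builder.build();
    }

    public static void main(String[] args) {
        // Host, repository and credentials below are placeholders.
        HttpRequest request = manifestHead("registry.example.com", "asr/base-image", "3", "user", "secret");
        System.out.println(request.method() + " " + request.uri());
    }
}
```

The request can then be sent with `HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.discarding())`. A 403 only on the unauthenticated variant would suggest that credentials for the source registry are not reaching the build.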

I'm running on an M1 MacBook with the latest Docker Desktop. As far as I can tell, this seems to be an issue only on Mac or with recent versions of Docker. This actually works from our Linux AMD64-based CI infrastructure. However, that infra is using Docker 20.x.x. I know from other reported issues that behavior varies between older and newer Docker clients.

At this point, I'm not sure how to debug this any further. I'm open to debug builds or even some hints on where to dig into the code. I spent some time looking at it but there was nothing obvious.

Info

  • docker-maven-plugin version : 0.43.4
  • Maven version (mvn -v) : 3.9.2

  • Docker version :
Client:
 Cloud integration: v1.0.35+desktop.4
 Version:           24.0.6
 API version:       1.43
 Go version:        go1.20.7
 Git commit:        ed223bc
 Built:             Mon Sep  4 12:28:49 2023
 OS/Arch:           darwin/arm64
 Context:           default

Server: Docker Desktop 4.23.0 (120376)
 Engine:
  Version:          24.0.6
  API version:      1.43 (minimum version 1.12)
  Go version:       go1.20.7
  Git commit:       1a79695
  Built:            Mon Sep  4 12:31:36 2023
  OS/Arch:          linux/arm64
  Experimental:     false
 containerd:
  Version:          1.6.22
  GitCommit:        8165feabfdfe38c65b599c4993d227328c231fca
 runc:
  Version:          1.1.8
  GitCommit:        v1.1.8-0-g82f18fe
 docker-init:
  Version:          0.19.0
  GitCommit:        de40ad0
  • If it's a bug, how to reproduce :
@rohanKanojia
Member

@ichasepucks : Thanks for reporting. I'll try to reproduce your issue via a GitHub Actions macOS workflow.

@ichasepucks
Author

FWIW, I'm trying to have some colleagues reproduce this. So far, one colleague with an Intel Mac, an old version of Docker (20.10.12), and Maven 3.8.2 has this working.

@boris-prochazka-nentgroup

boris-prochazka-nentgroup commented Oct 6, 2023

If you look in your build listing you see buildx build --progress=plain --builder multiarch --platform linux/arm64
then later in the push listing you have buildx build --progress=plain --builder multiarch --platform linux/amd64,linux/arm64

I think the plugin builds it for a single platform, "linux/arm64", and later tries to push it as a multi-platform image.

When I look in the plugin sources BuildXService:134 buildAndLoadSinglePlatform() there is only support for single platform at a time, while BuildXService:147 pushMultiPlatform() performs multiplatform push.

I actually tried to do a multi-platform build with the Maven plugin using a number of different configurations, all according to the documentation, but never succeeded. I consider this a bug or a not-yet-supported feature.

@boris-prochazka-nentgroup

boris-prochazka-nentgroup commented Oct 9, 2023

What is needed here is to rewrite BuildXService:134 buildAndLoadSinglePlatform() to:

    protected void buildAndLoadPlatforms(
            List<String> buildX, String builderName, BuildDirs buildDirs,
            ImageConfiguration imageConfig, String configuredRegistry, File buildArchive
    )  throws MojoExecutionException {
        List<String> platforms = imageConfig.getBuildConfiguration().getBuildX().getPlatforms();

        // Docker v24.0.6 does not support "--load" of multi-platform builds into the local image store.
        // A single non-native platform can be loaded and run through Docker's QEMU emulation.
        buildX(buildX, builderName, buildDirs, imageConfig, configuredRegistry,
                    platforms, buildArchive, platforms.size()==1 ? "--load" : null);
    }

This was the easy part. A number of test cases then need to be modified and adapted to the fact that multi-platform builds can be done and single-platform builds don't need to be native.

@boris-prochazka-nentgroup

After some iterations I finally understood how the plugin works, and I have come to the conclusion that it works fine as it is, for good reasons. The basic reason is that multi-platform images can't be loaded into the local Docker image store on your machine. It only supports a single platform, although that can actually be a foreign platform. Later on, when you run deploy, the plugin reruns buildx ... --push, which builds the multi-platform image and pushes it upstream.

This all works fine for me, so disregard all my previous comments.
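
The two-phase behavior described in this comment can be sketched as follows: the build goal passes a single platform together with `--load`, while the push goal passes every configured platform together with `--push`. This is an illustration only, not the plugin's code. `BuildxCommandSketch` and `buildxArgs` are hypothetical names, and the only CLI flags used are the ones that appear in the logs earlier in this issue.

```java
import java.util.ArrayList;
import java.util.List;

final class BuildxCommandSketch {

    // Hypothetical helper, not the plugin's code: assembles a `docker buildx build`
    // command line from the flags visible in the logs of this issue.
    static List<String> buildxArgs(String configDir, String builder, List<String> platforms,
                                   String tag, String dockerfile, String contextDir, boolean push) {
        if (!push && platforms.size() != 1) {
            // Per the comment above, a multi-platform result cannot be loaded locally.
            throw new IllegalArgumentException("--load needs exactly one platform");
        }
        List<String> cmd = new ArrayList<>(List.of(
                "docker", "--config", configDir,
                "buildx", "build", "--progress=plain",
                "--builder", builder,
                "--platform", String.join(",", platforms),
                "--tag", tag,
                "--file=" + dockerfile,
                contextDir));
        cmd.add(push ? "--push" : "--load");
        return cmd;
    }

    public static void main(String[] args) {
        List<String> all = List.of("linux/amd64", "linux/arm64");
        // Build goal: one platform, loaded into the local image store.
        System.out.println(String.join(" ", buildxArgs("target/docker/cfg", "multiarch",
                all.subList(1, 2), "application-img:latest", "Dockerfile", ".", false)));
        // Push goal: all platforms, pushed straight to the registry.
        System.out.println(String.join(" ", buildxArgs("target/docker/cfg", "multiarch",
                all, "remote-repo/application-img:latest", "Dockerfile", ".", true)));
    }
}
```

Note that both invocations run the full build, so the push phase pulls the base images again and needs pull credentials of its own.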

@ichasepucks
Author

ichasepucks commented Jan 22, 2024

Hi @rohanKanojia this was really bugging me again so I built the plugin locally and started trying some things. I've narrowed this down to using the configuration. https://github.com/fabric8io/docker-maven-plugin/blob/master/src/main/java/io/fabric8/maven/docker/service/BuildXService.java#L76-L79

However, I don't know why this is causing the problem, and I'm not sure what you'd recommend to resolve it. An option to ignore the config could work, but we should probably first understand why it's causing the failure.

Edit:
I was reading more on https://docs.docker.com/engine/reference/commandline/cli/#configuration-files. This explains why it's not able to authenticate correctly: it's losing all of the auth settings from the config.json in my home dir. This works on my remote agents due to an older Docker version. Why is this being set during the buildx push command anyway?

@ichasepucks
Author

@rohanKanojia I would be willing to submit a PR for this if you could provide the necessary background info and some guidance on a potential approach.

@rohanKanojia
Member

some guidance on a potential approach.

@ichasepucks : Do you have something in mind that you would like to propose to fix this? I will try my best to help you. Problem is that I don't have mac machine, I'm not sure I would be able to reproduce the issue. Do you have some minimal reproducer project that I can build to reproduce the problem?

@ichasepucks
Author

@rohanKanojia thanks for responding. I don't currently have a simple generic repro but I may be able to create one. I think my first question is why is the config added by default on new versions of Docker, here? One solution is to add a configuration option that only adds this when present. By default, I would expect most users to have their config directory already set up and configured how they need? Just trying to make sure I'm not missing an obvious reason for this bit of code.

@ichasepucks
Author

ichasepucks commented Jan 31, 2024

Actually looks like this is a duplicate of #1701. I verified this works with version 0.43.3. This was broken in https://github.com/fabric8io/docker-maven-plugin/pull/1703/files.

@rohanKanojia
Member

rohanKanojia commented Feb 3, 2024

@ichasepucks : I see. let me look into this.

@rohanKanojia rohanKanojia self-assigned this Feb 3, 2024
@rohanKanojia
Member

@ichasepucks : Could you please share the location where you've installed docker-buildx? Is it in $HOME/.docker/cli-plugins? What happens if you install the plugin in some other folder? I suspect this is related to docker/for-mac#6928

docker --config path/to/config buildx ... seems to work okay on Linux and Windows but it fails on Mac OS.

rohanKanojia added a commit to rohanKanojia/docker-maven-plugin that referenced this issue Feb 4, 2024
Related to fabric8io#1709

Currently the Docker CLI on macOS doesn't seem to respect the `--config` flag.
When DMP tries to override the default Docker config directory by providing
the `--config` flag, the Docker CLI is no longer able to recognize buildx
options.

This seems to happen in scenarios where docker-buildx is installed
in `~/.docker/cli-plugins`: whenever `docker --config new/path/config`
is provided, the Docker CLI uses the new config path (which does not contain
buildx).

Add a workaround that copies the `docker-buildx` binary to the temporary config
directory created for docker buildx build. This seems to make docker
recognize buildx even after the config override.

Signed-off-by: Rohan Kumar <[email protected]>
@rohanKanojia
Member

@ichasepucks : I think this comment is a very accurate description of the issue you're facing: docker/for-mac#6928 (comment)

I believe --config defaults to ~/.docker, so that buildx is loaded from ~/.docker/cli-plugins.

If you set --config=/tmp/docker, it will stop looking in ~/.docker/cli-plugins, and fall back to looking in the system directories for buildx.

From your docker info, I'm guessing you only have buildx installed in ~/.docker/cli-plugins, so setting this flag breaks your env.

I think we can add a workaround in DMP code to copy the docker-buildx binary to the config directory so that the macOS Docker CLI is able to resolve buildx when --config is overridden. I tested this approach with a GitHub Actions macOS workflow and was able to build and push to a ghcr.io registry.
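
As a rough illustration of that workaround (not the code in the linked branch), copying the plugin binary could look like the sketch below. The class and method names are hypothetical. The only layout assumed is the one quoted above: the Docker CLI looks for `docker-buildx` in the `cli-plugins` subdirectory of whichever config directory it is given.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

final class BuildxPluginCopySketch {

    // Copies <sourceConfigDir>/cli-plugins/docker-buildx into <targetConfigDir>/cli-plugins/.
    // Returns the copied file, or null when the source binary does not exist
    // (for example because buildx is installed in a system directory instead).
    static Path copyBuildxPlugin(Path sourceConfigDir, Path targetConfigDir) throws IOException {
        Path source = sourceConfigDir.resolve("cli-plugins").resolve("docker-buildx");
        if (!Files.isRegularFile(source)) {
            return null;
        }
        Path targetDir = targetConfigDir.resolve("cli-plugins");
        Files.createDirectories(targetDir);
        Path target = targetDir.resolve("docker-buildx");
        // Symbolic links are followed, so the real binary is copied.
        Files.copy(source, target, StandardCopyOption.REPLACE_EXISTING);
        // Make sure the copy is executable regardless of how attributes were carried over.
        target.toFile().setExecutable(true);
        return target;
    }

    public static void main(String[] args) throws IOException {
        Path home = Path.of(System.getProperty("user.home"), ".docker");
        Path temporaryConfig = Files.createTempDirectory("docker-config");
        System.out.println("Copied to: " + copyBuildxPlugin(home, temporaryConfig));
    }
}
```

This only addresses plugin resolution. It does not carry over any credentials from the original config directory.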

I have pushed code related to above mentioned approach in this branch https://github.com/rohanKanojia/docker-maven-plugin/tree/pr/buildx-push-fails-mac .

Could you please try out this branch and confirm if it is working for your environment?

@ichasepucks
Author

Hi @rohanKanojia this still doesn't work. The problem is not resolving buildx. The problem is that we should not be setting a docker --config option by default. This is only a problem when you need to access images that require authentication and we've changed from a default docker config that has been previously authenticated to a blank one that has not. There is no reason to send a --config option by default to docker. Why are we doing this?

@rohanKanojia
Member

rohanKanojia commented Feb 5, 2024

@ichasepucks :

this still doesn't work.

Could you please elaborate what error you faced while running it? I had tested it on MacOS via GitHub action macos-13 runner.

Docker Maven Plugin supports reading registry credentials from the Maven settings.xml and from properties, in addition to the default Docker config. We need to set the --config flag to pass that auth information along in these scenarios.

I think we can try to set --config only for push scenarios.

@ichasepucks
Author

ichasepucks commented Feb 5, 2024

@rohanKanojia the error is the same as originally reported. I have server entries for the repositories I need authentication to but I still get a 403 Forbidden from the docker registry. Let me see if I can put some extra logging in to see if this is actually trying to use the auth configuration passed. It appears not to be in this context.

Edit: I see [INFO] DOCKER> Credentials helper reply for "docker-credential-osxkeychain" is docker-credential-osxkeychain (github.com/docker/docker-credential-helpers) v0.7.0. Does this indicate it's not using the server settings in settings.xml? I don't see any AuthConfig logs at all in my output.

@rohanKanojia
Member

@ichasepucks : Could you please set a breakpoint on this line and see what value is being picked inside AuthConfig?

if (isDockerCLINotLegacy() || shouldAddConfigInLegacyDockerCLI(authConfig, configuredRegistry)) {

Run mvnDebug docker:push from terminal and connect via Remote JVM Debug from your IDE.

@ichasepucks
Author

@rohanKanojia debugged this a bit more and I think I see what's going on now. The AuthConfig object is non null and contains the configuredRepository which is the same as I defined in the pushRegistry configuration. The config.json is being written with these auth credentials. The problem however, is that the base FROM image is hosted on a registry that needs READ auth also. This other registry is actually what is failing on the FROM attempt to pull the base image. So this isn't actually about pushing at all.

Is there a way I can get more than 1 registry in the AuthConfig object?

@rohanKanojia
Member

@ichasepucks : I think we already handle the case of fetching auth config for base images. Could you please set a breakpoint here and see what's happening?

Set<String> fromRegistries = getRegistriesForPull(buildConfig);
for (String fromRegistry : fromRegistries) {
    if (StringUtils.isNotBlank(configuredRegistry) && configuredRegistry.equalsIgnoreCase(fromRegistry)) {
        continue;
    }
    registryConfig = getRegistryConfig(fromRegistry);
@ichasepucks
Author

@rohanKanojia BuildMojo doesn't appear to be used during the buildx push operation. When I added logging to the BuildMojo code, I could see that the source registry is created as expected in the config.json, but since this code path doesn't execute in the buildx push path, the AuthConfig doesn't contain the source registry credentials. I think this is the source of the issue. Does this make sense?
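
To make the symptom concrete: a Docker `config.json` keeps one entry per registry under `auths`, so the config written for the push has to contain an entry for the source registry as well as the push registry. The sketch below writes such a file body by hand. It is illustrative only: `DockerConfigSketch` and its `Credentials` class are hypothetical and unrelated to the plugin's `AuthConfig`, and the registry names are placeholders.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.LinkedHashMap;
import java.util.Map;

final class DockerConfigSketch {

    // Minimal credentials holder (hypothetical, not the plugin's AuthConfig).
    static final class Credentials {
        final String username;
        final String password;

        Credentials(String username, String password) {
            this.username = username;
            this.password = password;
        }
    }

    // Renders {"auths":{"<registry>":{"auth":"<base64(user:password)>"}, ...}}.
    static String toConfigJson(Map<String, Credentials> byRegistry) {
        StringBuilder json = new StringBuilder("{\"auths\":{");
        boolean first = true;
        for (Map.Entry<String, Credentials> entry : byRegistry.entrySet()) {
            if (!first) {
                json.append(',');
            }
            first = false;
            Credentials c = entry.getValue();
            String token = Base64.getEncoder().encodeToString(
                    (c.username + ":" + c.password).getBytes(StandardCharsets.UTF_8));
            json.append('"').append(escape(entry.getKey())).append("\":{\"auth\":\"")
                .append(token).append("\"}");
        }
        return json.append("}}").toString();
    }

    // Escapes the two characters that would break a JSON string literal here.
    private static String escape(String s) {
        return s.replace("\\", "\\\\").replace("\"", "\\\"");
    }

    public static void main(String[] args) {
        Map<String, Credentials> auths = new LinkedHashMap<>();
        auths.put("push.example.com", new Credentials("deployer", "secret"));
        auths.put("source.example.com", new Credentials("reader", "secret"));
        System.out.println(toConfigJson(auths));
    }
}
```

If only the push registry's entry is present, the `FROM` pull during `buildx build --push` has no credentials for the source registry, which is consistent with the 403 reported at the top of this issue.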

@rohanKanojia
Member

@ichasepucks : Does the issue get fixed if you copy and paste this code into RegistryService:

AuthConfig authConfig = createAuthConfig(true, imageName.getUser(), configuredRegistry, registryConfig);

@ichasepucks
Author

@rohanKanojia sorry I didn't follow the request. Can you tell me where you want that line copied to or clarify what you want me to try?

@rohanKanojia
Member

sorry I didn't follow the request. Can you tell me where you want that line copied to or clarify what you want me to try?

@ichasepucks : I just tried making modifications in RegistryService. I want you to try out this branch (I haven't tested it myself though since I don't have same environment as you) .

@ichasepucks
Author

@rohanKanojia you got it! This is working now!

@rohanKanojia
Member

@ichasepucks : Would you like to submit a PR to fix this issue?

@ichasepucks
Author

I'm not sure what I would do differently from your code changes. Happy to submit your changes from my own fork if that is helpful to you.

@rohanKanojia
Member

@ichasepucks : I just did a spike. I think we would also need to adapt existing tests + write new tests

@ichasepucks
Author

@rohanKanojia I see. I could possibly do that in the next week or 2 if you don't have time to add them. Thanks.

@rohanKanojia
Member

@ichasepucks : Oh okay, I will try to complete it this coming weekend.

@rohanKanojia
Member

@ichasepucks : I've merged #1751

Could you please give Docker Maven Plugin 0.44-SNAPSHOT a try and provide feedback on whether it's working as expected for your use case?

@ichasepucks
Author

@rohanKanojia I can confirm that 0.44-SNAPSHOT is working as expected for me. Thanks!

@rohanKanojia
Member

@ichasepucks : Thanks a lot, I'll ask some other users as well and will cut a release next week.

@rohanKanojia
Member

@ichasepucks : I've released 0.44.0, which contains the fix for your issue. I would appreciate it if you could try it out and provide feedback.

@ichasepucks
Author

@rohanKanojia 0.44.0 works for me. Thanks.

3 participants