Platforms 0.10 and 0.11 have issues with Multi-Arch Daemon builds. #1457

Open
BarDweller opened this issue Jan 25, 2025 · 0 comments
Labels
status/triage type/bug Something isn't working


Summary

During a daemon build at platform level 0.10 or 0.11 with a multi-arch run image, the restorer will reach out to the repository of the run image to identify the architecture-specific digest, and update analyzed.toml with the sha reference of the targeted architecture.

This is problematic, as

  1. the run image may be a daemon-only image that is not even present in the repository it is tagged for.
  2. rewriting the run image reference based on information from a remote repo makes an image pull policy of 'never' unworkable: an unexpected run image may be selected, and a pull policy of 'never' is no longer enough to prevent additional image pulls.
  3. even if the platform has pulled the run image for the target architecture, it will not be present in the daemon under the architecture-specific sha. When the Docker Engine API is used to pull a multi-arch image with a platform, the image ends up in the local daemon with the digest of the combined manifest, not that of the architecture-specific manifest. Docker Engine offers no API to query the architecture-specific digests of a multi-arch image, making it tough for a platform to pull the image required for the export step.
  4. it is unclear for platform 0.10 and 0.11 which builds will need to re-pull the run image to satisfy the digest reference updated by the restorer, as the scenarios under which the restorer rewrites the reference are not specified. The behavior has been observed for extension builds that switch to a multi-arch run image, but it likely affects all daemon builds with multi-arch run images where the restorer updates the run image reference. This should be documented in the platform spec for 0.10 and 0.11, so platform implementers can understand when image pulls are required.
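The mismatch described in item 3 can be illustrated with a small sketch. It is not lifecycle code: the image index follows the OCI image-index layout, but the digests are shortened placeholders, and `select_platform_digest` and `daemon_images` are hypothetical names used only for illustration.

```python
# Hypothetical multi-arch image index in the OCI image-index layout.
# Digests are shortened placeholders, not real values.
INDEX_DIGEST = "sha256:index000"
IMAGE_INDEX = {
    "schemaVersion": 2,
    "mediaType": "application/vnd.oci.image.index.v1+json",
    "manifests": [
        {"digest": "sha256:amd64aaa",
         "platform": {"os": "linux", "architecture": "amd64"}},
        {"digest": "sha256:arm64bbb",
         "platform": {"os": "linux", "architecture": "arm64"}},
    ],
}


def select_platform_digest(index, os_name, arch):
    """Return the digest of the manifest matching os/arch, or None."""
    for manifest in index["manifests"]:
        platform = manifest.get("platform", {})
        if platform.get("os") == os_name and platform.get("architecture") == arch:
            return manifest["digest"]
    return None


# Toy stand-in for the daemon's image store. Per item 3, a platform-specific
# pull leaves the image stored under the digest of the combined manifest.
daemon_images = {INDEX_DIGEST: "run-image-for-arm64"}

# The reference the restorer would write after consulting the remote repo.
rewritten = select_platform_digest(IMAGE_INDEX, "linux", "arm64")

# Looking up the rewritten digest misses; the index digest hits.
found_by_rewritten = daemon_images.get(rewritten)
found_by_index = daemon_images.get(INDEX_DIGEST)
```

Under these assumptions, the lookup by the rewritten digest comes back empty even though the right image is already in the daemon, which matches the exporter failure described in this issue.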

Reproduction

Run an extension-based build against the daemon with a multi-arch run image target, using a platform that pulls the run image after the detect phase has identified the new image. The exporter phase will fail because it is unable to locate the run image referenced by the arch-specific digest (currently that error presents itself as a top layer sha issue, as per #1456).

Expected behavior

From 0.12 onwards, restorer has a -daemon flag that stops this behavior and uses the id of the run image in the daemon instead. This solves all the challenges above, but does not help when using Builders that request platforms 0.10/0.11.

An ideal fix would be to add the -daemon flag to the 0.10 and 0.11 platforms, if making changes to a platform level after release is allowed.

If the 0.10/0.11 platform specs cannot be changed now, then resolving this becomes much harder. At 0.10 and 0.11, restorer has no knowledge of whether the build is for a daemon or not. Perhaps analyzed.toml could be updated to carry this flag from the analyze step. For example, the run image reference could be supplemented with a run image daemon flag, which restorer could use internally as if the -daemon behavior from 0.12 had been requested. (0.12 and onwards could presumably also honor that flag if -daemon is not passed explicitly.)
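A minimal sketch of how such a flag might be consulted, assuming analyzed.toml is read into a dict with a `run-image` table. The `daemon` key and the `resolve_run_image_reference` helper are hypothetical: neither exists in any platform spec today, and this is not how the restorer is implemented.

```python
def resolve_run_image_reference(analyzed, daemon_image_id, remote_arch_digest):
    """Pick the run image reference the later phases should use.

    analyzed           -- dict mirroring analyzed.toml; the "daemon" key is hypothetical
    daemon_image_id    -- id of the run image as already present in the daemon
    remote_arch_digest -- architecture-specific reference from a registry lookup
    """
    run_image = analyzed.get("run-image", {})
    if run_image.get("daemon", False):
        # Mirror the 0.12 -daemon behavior described above: use the daemon id.
        return daemon_image_id
    # Without the flag, keep the current 0.10/0.11 behavior.
    return remote_arch_digest


# Placeholder values for illustration only.
with_flag = {"run-image": {"reference": "example/run:latest", "daemon": True}}
without_flag = {"run-image": {"reference": "example/run:latest"}}

chosen_with_flag = resolve_run_image_reference(
    with_flag, "sha256:daemonid", "example/run@sha256:arm64bbb")
chosen_without_flag = resolve_run_image_reference(
    without_flag, "sha256:daemonid", "example/run@sha256:arm64bbb")
```

Defaulting the flag to false when it is absent would leave non-daemon builds unchanged, so only daemon builds whose analyze step set the flag would see different behavior.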

Either way, it's going to be a behavioral change for 0.10 and 0.11, but arguably the current behavior for multi-arch daemon builds at 0.10 and 0.11 is broken, so the change may be acceptable.


Context

lifecycle version

tested with 0.20.5, likely exists in every version since multi-arch was added.

platform version(s)

0.10, 0.11

anything else?

only affects daemon builds with multi-arch run images

@BarDweller BarDweller added status/triage type/bug Something isn't working labels Jan 25, 2025