Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ci: add multiarch testing #1334

Open
wants to merge 2 commits into
base: main
Choose a base branch
from
Open

ci: add multiarch testing #1334

wants to merge 2 commits into from

Conversation

ttshivers
Copy link
Member

@ttshivers ttshivers commented Sep 23, 2020

Adds multiarch testing on the github actions.

Some tests do fail, because they are failing in production too, like the alpine3.12 builds and the nodejs 12-14 i386 builds. I know i386 isn't officially supported anymore in those versions, so perhaps allows those failures is in order?

In writing the Github actions, I realized that all the versions and variants could be written into one big testing matrix (since they can be dynamic) in one file if that's desirable rather than having almost duplicated files for each version x variant. I could also write up a proof of concept PR for what that would be like.

@ttshivers
Copy link
Member Author

There seems to be some false build failures with buster/buster-slim only on arm32v7 with some ssl cert errors like:

#7 99.25 + curl -fsSLO --compressed https://nodejs.org/dist/v14.11.0/node-v14.11.0-linux-armv7l.tar.xz
#7 99.49 curl: (60) SSL certificate problem: unable to get local issuer certificate
#7 99.50 More details here: https://curl.haxx.se/docs/sslcerts.html
#7 99.50 
#7 99.50 curl failed to verify the legitimacy of the server and therefore could not
#7 99.50 establish a secure connection to it. To learn more about this situation and
#7 99.50 how to fix it, please visit the web page mentioned above.

I am unsure why they are failing here but not in production but will investigate.

Copy link
Member

@nschonni nschonni left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just restarting the failing jobs now.
I think the approach is interesting, but I do have a concern on resource exhaustion on the 100+ jobs. I'm not sure if the 20 job limit is per repo or per org

@@ -216,10 +219,31 @@ jobs:
build:
name: ${version} on ${variant}
runs-on: ubuntu-latest
strategy:
fail-fast: false
matrix: \${{fromJson('{\"arch\":${arch_json}}')}}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Personally prefer the list style because it's easier to see diffs when items are removed. EX:

matrix:
  arch:
    - amd64
    - armv6
    ...

Not a blocker, and may be painful to do in bash

Copy link
Member Author

@ttshivers ttshivers Sep 24, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Haha yeah, the pain is why I did it that way originally but I see your point about it being easier to see diffs. I'll convert it to that style

run: sudo apt-get install bats

- name: Set up QEMU
uses: docker/setup-qemu-action@v1
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it worth skipping these steps for amd64? EX: if: matrix.arch != 'amd64'

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Skipping it for amd64 would work just fine. The step usually takes around 5 seconds anyway.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I wasn't sure which of the other step below were needed for the non-QEMU arch builds

@ttshivers
Copy link
Member Author

Thinking about how to make this even cleaner:

One possibility is to change the update.sh to generate a single action file that has one big matrix like:

{
  "include": [
    { "version": "10",  "variant": "alpine3.11", "arch": "amd64" },
    { "version": "10",  "variant": "alpine3.11", "arch": "i386" },
  ]
}

It's also possible to replace all the generated test actions with a single file that automatically detects the versions, variants, and arches supported and generates a dynamic matrix at runtime so it doesn't have to be hardcoded and updated manually when any of those change.

@ttshivers
Copy link
Member Author

ttshivers commented Sep 24, 2020

Just restarting the failing jobs now.
I think the approach is interesting, but I do have a concern on resource exhaustion on the 100+ jobs. I'm not sure if the 20 job limit is per repo or per org

I am unsure of how those limits apply too. I can't find any documentation easily that clarifies this, so I'll reach out to GitHub and ask them about how those limits apply.

Edit:
Found a forum post stating "You can execute up to 20 workflows concurrently per repository"

@nschonni
Copy link
Member

One possibility is to change the update.sh to generate a single action file that has one big matrix like:

I originally split them over to separate files so that jobs only get triggered when a particular job has changed. If you think that could be dynamic that might be interesting though

@ttshivers
Copy link
Member Author

ttshivers commented Sep 24, 2020

One possibility is to change the update.sh to generate a single action file that has one big matrix like:

I originally split them over to separate files so that jobs only get triggered when a particular job has changed. If you think that could be dynamic that might be interesting though

I think I could make it work that way by manually examining the list of changed files in the metadata and building the matrix dynamically based on what images changed. If no relevant files were changed, it would just not populate the matrix with any entries. I'd likely write up this dynamic runtime matrix generation logic in some nodejs script since it's easy to run and integrate with GitHub actions and bash is quite annoying. I think this option seems promising and not too difficult to do, especially since part of the logic is already done in the existing bash scripts.

Another option or combination is to take advantage of github action caching and docker caching which should use layers from previous runs that haven't changed.

@tianon
Copy link
Contributor

tianon commented Sep 24, 2020

You could also consider using Travis just for multi-arch testing (https://docs.travis-ci.com/user/multi-cpu-architectures/), which would allow the builds to be native instead of emulated.

@ttshivers
Copy link
Member Author

You could also consider using Travis just for multi-arch testing (https://docs.travis-ci.com/user/multi-cpu-architectures/), which would allow the builds to be native instead of emulated.

Natively building would certainly be much faster. It supports a few architectures but is missing: arm32v6, arm32v7, i386.

@tianon
Copy link
Contributor

tianon commented Sep 25, 2020 via email

@ttshivers
Copy link
Member Author

Both those arm32 arches can build on the graviton instances, and i386 can build on any amd64.

Ah ok. That's certainly a valid option then. One note is that testing was moved from Travis CI in #1194

platform="arm/v6"
;;
arm32v7)
platform="arm/v7"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure this is right, it you're looking to match with the value in https://nodejs.org/dist/v10.22.1/

Copy link
Member Author

@ttshivers ttshivers Sep 27, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm trying to convert the arches in the architectures file into an arch that docker supports.

Like, for example, the 12-alpine3.10 image has a manifest with the arches linux/386, linux/amd64, linux/arm/v6, linux/arm64/v8, linux/ppc64le, linux/s390x.

In 12/architectures, it lists lists the arches for alpine3.10 as: amd64, arm32v6, arm32v7, arm64v8, i386, ppc64le, s390x.

I'm just attempting to translate the the architectures files into the arches that docker recognizes in order to build them. So, my current translation looks like this:

archfile	docker
i386		386
amd64		amd64
arm32v6		arm/v6
arm32v7		arm/v7
arm64v8		arm64/v8
ppc64le		ppc64le
s390x		s390x

It looks like bashbrew does this too.

Am I on the right track here?

@nschonni
Copy link
Member

Now that the generated build matrix landed, does this get easier?

@ttshivers
Copy link
Member Author

Now that the generated build matrix landed, does this get easier?

Yeah. I think it will be significantly simplified. Also, the logic I've done in #1341 will help with architecture parsing too.

Base automatically changed from master to main March 15, 2021 16:23
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants