Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

enforce wheel size limits, README formatting in CI #114

Closed
wants to merge 7 commits into from

Conversation

jameslamb
Copy link
Member

Description

Contributes to rapidsai/build-planning#110

Proposes adding 2 types of validation on wheels in CI, to ensure we continue to produce wheels that are suitable for PyPI.

@jameslamb jameslamb added improvement Improves an existing functionality non-breaking Introduces a non-breaking change labels Nov 13, 2024
Comment on lines 8 to 12
python -m pip install \
'pydistcheck==0.8.*' \
twine

pyenv rehash
Copy link
Member Author

@jameslamb jameslamb Nov 14, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At first, I was seeing CI failures like this in this PR:

./ci/validate_wheel.sh: line 11: pydistcheck: command not found

(build link)

Think I see the root cause.

pydistcheck is available from the rapidsai/ci-wheel Docker image: rapidsai/ci-imgs#205

This project does all of its CUDA 12 builds using CUDA 12.0.1.

export BUILD_MATRIX="
- { CUDA_VER: '12.0.1', ARCH: 'amd64', PY_VER: '3.10', LINUX_VER: 'rockylinux8' }
- { CUDA_VER: '12.0.1', ARCH: 'amd64', PY_VER: '3.11', LINUX_VER: 'rockylinux8' }
- { CUDA_VER: '12.0.1', ARCH: 'amd64', PY_VER: '3.12', LINUX_VER: 'rockylinux8' }
- { CUDA_VER: '12.0.1', ARCH: 'arm64', PY_VER: '3.10', LINUX_VER: 'rockylinux8' }
- { CUDA_VER: '12.0.1', ARCH: 'arm64', PY_VER: '3.11', LINUX_VER: 'rockylinux8' }
- { CUDA_VER: '12.0.1', ARCH: 'arm64', PY_VER: '3.12', LINUX_VER: 'rockylinux8' }
"

And therefore ends up pulling rapidsai/ci-wheel:cuda12.0.1-rockylinux8-py3.12 (build link).

We stopped building CUDA 12.0.1 ci-wheel images in rapidsai/ci-imgs#194... and so no new images with that tag was published with pydistcheck in it. The latest rapidsai/ci-wheel:cuda12.0.1-rockylinux8-py3.12 was pushed about a month ago: https://hub.docker.com/r/rapidsai/ci-wheel/tags?name=cuda12.0.1-rockylinux8-py3.12

Screenshot 2024-11-13 at 6 41 44 PM

Assuming pynvjitlink needs to keep building on CUDA 12.0.1 for compatibility reasons, options I see:

For the purpose of these validation checks, just installing it at runtime should be fine... they're small libraries with few dependencies, that are quick to install.

But having this repo's CI rely on pulling images that are no longer getting updated risks it missing out on other updates (like security patches to libraries in the base image), and on the need to do more special-casing of this repo to use other things newly installed in ci-wheel images.

@bdice @KyleFromNVIDIA what do you think?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I’m not sure if it’s important for this repo to have size checks. The lowest friction thing is probably to update the CI image tags. I don’t think we’ll have issues from that, please give it a try.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's helpful to have size checks for anything we're publishing to PyPI, even if we feel it's unlikely they'll grow much... they're cheap maintenance-wise, and help avoid surprises at release time.

PR to add back the 12.0.1 images: rapidsai/ci-imgs#206

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh. I meant we could use something newer than 12.0.1 in this repo, not add back the 12.0.1 images.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ohhhhhh I'm sorry, misunderstood what "update the CI image tags" meant! Sure, let me try that here.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alright I tried updating the matrix to "all builds on CUDA 12.5.1, tests on a mix of CUDA 12.0.1 and 12.5.1". Looks like the 2 test jobs on CUDA 12.5.1 are failing:

FAIL: test_atomic_add_double_global_3 (numba.cuda.tests.cudapy.test_atomics.TestCudaAtomics.test_atomic_add_double_global_3)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/opt/conda/envs/test/lib/python3.11/site-packages/numba/cuda/tests/cudapy/test_atomics.py", line 669, in test_atomic_add_double_global_3
    self.assertCorrectFloat64Atomics(cuda_func, shared=False)
  File "/opt/conda/envs/test/lib/python3.11/site-packages/numba/cuda/tests/cudapy/test_atomics.py", line 572, in assertCorrectFloat64Atomics
    self.assertIn(f'{inst}.add.f64', asm)
AssertionError: 'red.add.f64' not found in '***BIG STRING***'

(build link)

@bdice could you help me with this? I'm not familiar with the compatibility story for this library, so not sure if failures like that are expected.

I hope we can get wheel size checks running here, but if this would take a lot of effort to debug and if you also want to avoid having a pip install twine pydistcheck on every wheel-building job (for reasons like rapidsai/shared-actions#22 (comment)), then I'd be fine with skipping this repo for the purpose of this release.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think I have a good idea of where to start debugging that. @gmarkall may have to help. I think I would prefer to just skip this repository, I don't think we gain much from adding the size checks right now.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alright yeah let's skip it, this seems like a good place to cut scope in favor of all the other things we're trying to land in this release.

Let's close this.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I put up an issue about it: #116

I still think it'd be good to return to at some point.

@jameslamb jameslamb marked this pull request as ready for review November 14, 2024 00:51
@jameslamb jameslamb requested review from a team as code owners November 14, 2024 00:51
@jameslamb jameslamb requested a review from bdice November 14, 2024 00:51
@jameslamb jameslamb changed the title WIP: [DO NOT MERGE] enforce wheel size limits, README formatting in CI enforce wheel size limits, README formatting in CI Nov 14, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
improvement Improves an existing functionality non-breaking Introduces a non-breaking change
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants