port to rattler-build #1796

Open
wants to merge 54 commits into
base: branch-25.04

@gforsyth gforsyth commented Jan 27, 2025

Some notes on progress and changes needed:

  • GIT_DESCRIBE_HASH and GIT_DESCRIBE_NUMBER aren't supported, so those
    need to be set in the environment.

    • This is now handled by rapids-configure-rattler in gha-tools
  • For most of our recipes, we'll want to use the cache key that I have set
    up in librmm, otherwise each output is built as a separate recipe so you
    end up compiling everything N times

  • I suspect it's faster to port over recipes by hand, rather than with conda-recipe-manager convert. There's a fair bit of preparation required to make the meta.yaml files compatible in the first place, and failures can be hard to diagnose

    • Need to remove jinja conditionals in favor of minijinja syntax
    • Only some support for ternary operators (no !=)
    • conda-recipe-manager doesn't support generating multi-output recipes at this point (although it can parse them)
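As a rough illustration of the first note: as far as I understand, conda-build derives GIT_DESCRIBE_NUMBER and GIT_DESCRIBE_HASH from the output of `git describe --tags --long`, so a pre-build step can compute the same values and export them. The sketch below is not taken from rapids-configure-rattler (its implementation is not shown in this thread); the function names and the sample tag are hypothetical.

```python
import os
import subprocess


def parse_git_describe(described: str) -> dict[str, str]:
    """Split `git describe --tags --long` output into its three parts.

    The output has the form <tag>-<commits since tag>-g<short hash>.
    rsplit is used because the tag itself may contain hyphens.
    """
    tag, number, commit_hash = described.strip().rsplit("-", 2)
    return {
        "GIT_DESCRIBE_TAG": tag,
        "GIT_DESCRIBE_NUMBER": number,
        "GIT_DESCRIBE_HASH": commit_hash,
    }


def export_git_describe(repo_dir: str = ".") -> None:
    """Run git in repo_dir and place the parsed values into os.environ."""
    out = subprocess.check_output(
        ["git", "describe", "--tags", "--long"], cwd=repo_dir, text=True
    )
    os.environ.update(parse_git_describe(out))


# Hypothetical sample output, used only for illustration.
example = parse_git_describe("v25.04.00a-17-g5d0a2446\n")
```

Calling `export_git_describe()` before invoking the build would make the values visible to anything that reads the environment, which is the general shape of the workaround described in the note.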

xref: rapidsai/build-planning#47

copy-pr-bot bot commented Jan 27, 2025

Auto-sync is disabled for draft pull requests in this repository. Workflows must be run manually.

{% if cuda_major != "11" %}
- cuda-cudart-dev
{% endif %}
- {{ "cuda-cudart-dev" if cuda_major == "12" else "cuda-version" }}
Contributor Author

conda-recipe-manager doesn't like != as a comparison operator

Member

Maybe we could use `not` and `==` instead?

Suggested change
- {{ "cuda-cudart-dev" if cuda_major == "12" else "cuda-version" }}
- {{ "cudatoolkit" if not (cuda_major == "12") else "cuda-version" }}

Contributor Author

Yep, that works!

Contributor

We want to write this in a way that only mentions CUDA 11, so that the condition can be trivially deleted when adding future major version support.

Member

Yeah that was the idea. I just goofed on the syntax. Took another go below

#1796 (comment)

Comment on lines 5 to 13
version: ${{ env.get("RAPIDS_PACKAGE_VERSION") }}
cuda_version: ${{ (env.get('RAPIDS_CUDA_VERSION') | split('.'))[:2] | join(".") }}
cuda_major: ${{ (env.get('RAPIDS_CUDA_VERSION') | split('.'))[0] }}
date_string: ${{ env.get("RAPIDS_DATE_STRING") }}
Contributor Author

I grabbed these from rapidsai/cugraph#4551, as conda-recipe-manager currently doesn't support converting the extended jinja2 syntax to the subset that rattler supports
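
For readers less familiar with the filter chain in these context lines, the following Python sketch mirrors what the `split`, slice, and `join` steps compute. It is an illustration only: the function names are hypothetical and nothing here calls rattler-build.

```python
def cuda_version(rapids_cuda_version: str) -> str:
    """Mirror of (value | split('.'))[:2] | join('.'): keep major.minor."""
    return ".".join(rapids_cuda_version.split(".")[:2])


def cuda_major(rapids_cuda_version: str) -> str:
    """Mirror of (value | split('.'))[0]: keep only the major component."""
    return rapids_cuda_version.split(".")[0]


# A patch-level version collapses to major.minor, and the major stays a string,
# which is why the recipes compare it against "11" rather than the integer 11.
print(cuda_version("12.8.0"), cuda_major("12.8.0"))
```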

@github-actions github-actions bot added the ci label Jan 29, 2025
@gforsyth gforsyth marked this pull request as ready for review January 29, 2025 19:48
@gforsyth gforsyth requested review from a team as code owners January 29, 2025 19:48
@gforsyth
Contributor Author

Ok, I have content diffs between the packages built using these recipes (on my local machine) and the most recent nightly builds:

Python looks good -- only change is an expected directory name difference:

🐚 colordiff nightly-rmm rattler-rmm
1,8c1,7
< info/licenses/LICENSE
< lib/python3.12/site-packages/rmm-25.4.0a17.dist-info/direct_url.json
< lib/python3.12/site-packages/rmm-25.4.0a17.dist-info/INSTALLER
< lib/python3.12/site-packages/rmm-25.4.0a17.dist-info/licenses/LICENSE
< lib/python3.12/site-packages/rmm-25.4.0a17.dist-info/METADATA
< lib/python3.12/site-packages/rmm-25.4.0a17.dist-info/RECORD
< lib/python3.12/site-packages/rmm-25.4.0a17.dist-info/REQUESTED
< lib/python3.12/site-packages/rmm-25.4.0a17.dist-info/WHEEL
---
> lib/python3.12/site-packages/rmm-25.4.0.dist-info/direct_url.json
> lib/python3.12/site-packages/rmm-25.4.0.dist-info/INSTALLER
> lib/python3.12/site-packages/rmm-25.4.0.dist-info/licenses/LICENSE
> lib/python3.12/site-packages/rmm-25.4.0.dist-info/METADATA
> lib/python3.12/site-packages/rmm-25.4.0.dist-info/RECORD
> lib/python3.12/site-packages/rmm-25.4.0.dist-info/REQUESTED
> lib/python3.12/site-packages/rmm-25.4.0.dist-info/WHEEL

librmm-tests is identical:

🐚 colordiff nightly-librmm-tests rattler-librmm-tests

librmm is slightly different:

🐚 colordiff nightly-librmm rattler-librmm
1,31d0
< include/fmt/args.h
< include/fmt/base.h
< include/fmt/chrono.h
< include/fmt/color.h
< include/fmt/compile.h
< include/fmt/core.h
< include/fmt/format.h
< include/fmt/format-inl.h
< include/fmt/os.h
< include/fmt/ostream.h
< include/fmt/printf.h
< include/fmt/ranges.h
< include/fmt/std.h
< include/fmt/xchar.h
< include/nvtx3/nvToolsExtCuda.h
< include/nvtx3/nvToolsExtCudaRt.h
< include/nvtx3/nvToolsExt.h
< include/nvtx3/nvToolsExtOpenCL.h
< include/nvtx3/nvToolsExtSync.h
< include/nvtx3/nvtx3.hpp
< include/nvtx3/nvtxDetail/nvtxImplCore.h
< include/nvtx3/nvtxDetail/nvtxImplCudaRt_v3.h
< include/nvtx3/nvtxDetail/nvtxImplCuda_v3.h
< include/nvtx3/nvtxDetail/nvtxImpl.h
< include/nvtx3/nvtxDetail/nvtxImplOpenCL_v3.h
< include/nvtx3/nvtxDetail/nvtxImplSync_v3.h
< include/nvtx3/nvtxDetail/nvtxInitDecls.h
< include/nvtx3/nvtxDetail/nvtxInitDefs.h
< include/nvtx3/nvtxDetail/nvtxInit.h
< include/nvtx3/nvtxDetail/nvtxLinkOnce.h
< include/nvtx3/nvtxDetail/nvtxTypes.h
1638,1644d1606
< lib/cmake/fmt/fmt-config.cmake
< lib/cmake/fmt/fmt-config-version.cmake
< lib/cmake/fmt/fmt-targets.cmake
< lib/cmake/fmt/fmt-targets-release.cmake
< lib/cmake/nvtx3/nvtx3-config.cmake
< lib/cmake/nvtx3/nvtx3-config-version.cmake
< lib/cmake/nvtx3/nvtx3-targets.cmake
1649,1652d1610
< lib/libfmt.so
< lib/libfmt.so.11
< lib/libfmt.so.11.0.2
< lib/pkgconfig/fmt.pc

Not including the cmake files seems broadly fine. The missing libfmt and header files seem less fine.
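
The same comparison can be done with set arithmetic instead of colordiff, which makes the "missing from the new build" direction explicit. A minimal sketch, assuming each manifest is a plain list of package-relative paths; the function name and the sample data are hypothetical.

```python
def compare_manifests(nightly: list[str], rattler: list[str]) -> dict[str, list[str]]:
    """Report paths present in only one of the two package manifests."""
    nightly_set, rattler_set = set(nightly), set(rattler)
    return {
        "only_in_nightly": sorted(nightly_set - rattler_set),
        "only_in_rattler": sorted(rattler_set - nightly_set),
    }


# Tiny hand-written sample, not the real package contents.
nightly_files = ["include/fmt/core.h", "include/rmm/example.hpp", "lib/libfmt.so"]
rattler_files = ["include/rmm/example.hpp"]
result = compare_manifests(nightly_files, rattler_files)
```

An empty `only_in_rattler` list together with a non-empty `only_in_nightly` list corresponds to the pure-deletion hunks (`1,31d0` and so on) in the diff above.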

@vyasr
Contributor

vyasr commented Feb 10, 2025

Any thoughts on my question about CBC->variants naming?

@gforsyth
Contributor Author

> Any thoughts on my question about CBC->variants naming?

Let me push up a commit and we can see how we like it

@vyasr
Contributor

vyasr commented Feb 10, 2025

The changes in the librmm package are concerning. It looks like the CMake install rules for dependent package builds (i.e. those cloned by spdlog) are not being applied. The install step for the librmm package does seem to include these packages, but somehow they're not making it into the final package. If it was only happening to fmt I might think that it had something to do with rattler-build handling clobbering differently from conda-build, but I'm pretty sure that some of the missing nvtx headers simply don't exist in the CTK yet so I don't think we would be clobbering preinstalled nvtx headers. Maybe we need some more explicit include/exclude statements? Perhaps rattler-build is inferring something different from what we expect.

@gforsyth
Contributor Author

> Let me push up a commit and we can see how we like it

scratch that for the moment -- it seems like selectors don't work in variants.yaml.

I'm still trying to figure this out, but if `fmt` is in `run` but not in `host`, then the header files and shared objects DO get copied into the package.

@vyasr
Contributor

vyasr commented Feb 10, 2025

> Let me push up a commit and we can see how we like it
>
> scratch that for the moment -- it seems like selectors don't work in variants.yaml.

Hmmm, that seems wrong to me. Are you sure that the selectors in the CBC are actually working, vs being silently ignored? I wouldn't expect that a priori, but that seems at least as likely to me as them having implemented a compatibility layer, given that they've been pretty clear about preferring to break compatibility to establish clear new behaviors.

We should probably switch to their preferred new syntax either way.

Update: From looking at their docs, it does seem like there is some intentional compatibility baked in. It's not clear how much to expect without looking at the code, though.

@gforsyth
Contributor Author

well, if we remove fmt from host but leave it in run, then things get copied over correctly.

still missing all the nvtx3 files and even if I use the always_include_files directive for include/nvtx3/*, nothing shows up, so something is off

🐚 colordiff nightly-librmm rattler-librmm-nohost
15,31d14
< include/nvtx3/nvToolsExtCuda.h
< include/nvtx3/nvToolsExtCudaRt.h
< include/nvtx3/nvToolsExt.h
< include/nvtx3/nvToolsExtOpenCL.h
< include/nvtx3/nvToolsExtSync.h
< include/nvtx3/nvtx3.hpp
< include/nvtx3/nvtxDetail/nvtxImplCore.h
< include/nvtx3/nvtxDetail/nvtxImplCudaRt_v3.h
< include/nvtx3/nvtxDetail/nvtxImplCuda_v3.h
< include/nvtx3/nvtxDetail/nvtxImpl.h
< include/nvtx3/nvtxDetail/nvtxImplOpenCL_v3.h
< include/nvtx3/nvtxDetail/nvtxImplSync_v3.h
< include/nvtx3/nvtxDetail/nvtxInitDecls.h
< include/nvtx3/nvtxDetail/nvtxInitDefs.h
< include/nvtx3/nvtxDetail/nvtxInit.h
< include/nvtx3/nvtxDetail/nvtxLinkOnce.h
< include/nvtx3/nvtxDetail/nvtxTypes.h
1642,1644d1624
< lib/cmake/nvtx3/nvtx3-config.cmake
< lib/cmake/nvtx3/nvtx3-config-version.cmake
< lib/cmake/nvtx3/nvtx3-targets.cmake

@vyasr
Contributor

vyasr commented Feb 10, 2025

> well, if we remove fmt from host but leave it in run, then things get copied over correctly.

OK, interesting. That certainly implicates clobbering to some degree.

> still missing all the nvtx3 files and even if I use the always_include_files directive for include/nvtx3/*, nothing shows up, so something is off

Maybe worth double-checking if there are any nvtx-related files in the build environment coming from other packages. Just to rule out clobbering as a possible root cause.

@gforsyth
Contributor Author

> Update: From looking at their docs, it does seem like there is some intentional compatibility baked in. It's not clear how much to expect without looking at the code, though.

It does seem to respect the CBC selectors:

with $RAPIDS_CUDA_VERSION=11.8

      "variant": {
        "c_compiler_version": "11",
        "c_stdlib": "sysroot",
        "c_stdlib_version": "2.28",
        "cmake_version": ">=3.26.4,!=3.30.0",
        "cuda_compiler": "nvcc",
        "cxx_compiler_version": "11",
        "librmm": "25.04.00 cuda11_250210_5d0a2446",
        "target_platform": "linux-64"
      },

with $RAPIDS_CUDA_VERSION=12.8

      "variant": {
        "c_compiler_version": "13",
        "c_stdlib": "sysroot",
        "c_stdlib_version": "2.28",
        "cmake_version": ">=3.26.4,!=3.30.0",
        "cuda_compiler": "cuda-nvcc",
        "cxx_compiler_version": "13",
        "librmm": "25.04.00 cuda12_250210_5d0a2446",
        "target_platform": "linux-64"
      },

@gforsyth
Contributor Author

Ok, so I'm not sure what's up with my LOCAL build environment, but I just pulled down the latest .conda package and looked over the manifest and it's identical to the rmm nightly build.

I'll investigate whether my changes are even required for the fmt fix.

@gforsyth
Contributor Author

Ok, so the changes removing fmt from the host environment ARE required to make sure those headers and shared objects get included.

the nvtx stuff seems to be a me problem, but it's working fine in CI

- "test -d \"${PREFIX}/include/rmm\""
about:
homepage: ${{ load_from_file("python/librmm/pyproject.toml").project.urls.Homepage }}
license: ${{ load_from_file("python/librmm/pyproject.toml").project.license.text | replace(" ", "-") }}
Contributor

@bdice bdice Feb 10, 2025

We should probably change python/librmm/pyproject.toml to use Apache-2.0 as an SPDX identifier. https://packaging.python.org/en/latest/guides/writing-pyproject-toml/#license

Like this:

- license = { text = "Apache 2.0" }
+ license = "Apache-2.0"

@vyasr Do you know why we chose to write it this way in #1529?

Contributor

We should probably fix this across the board -- I would suggest raising this with ops (in case they know of restrictions that I do not know). If ops is supportive, let's open a build-planning issue and audit this for all repos. Non-OSS repos may need a different solution, but Apache/BSD-3 repos should be fixable.

Contributor

Then this will look like:

Suggested change
license: ${{ load_from_file("python/librmm/pyproject.toml").project.license.text | replace(" ", "-") }}
license: ${{ load_from_file("python/librmm/pyproject.toml").project.license }}

Contributor

IIRC the current choices were made in order to guarantee compatibility with wheeltamer. @raydouglass may remember the exact list of "allowed" licenses. Given that we no longer run wheeltamer before releases, that is a moot point, but I don't know whether we have reinstated any subsequent scans where this could still be a problem.

Contributor

Good -- that's exactly what I wondered. If we can adopt a normal SPDX license identifier in our pyproject.toml files, we absolutely should. I am okay with that being a follow-up to this PR, though.
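
To make the two `pyproject.toml` layouts discussed here concrete, this sketch normalizes either form to the same identifier. For the table form it mirrors the `replace(" ", "-")` filter from the recipe; the string form passes through unchanged. The function name is hypothetical, and the dictionaries stand in for the parsed `[project]` table.

```python
def license_identifier(project: dict) -> str:
    """Return the license string for either pyproject.toml layout."""
    license_field = project["license"]
    if isinstance(license_field, str):
        # Newer layout: license = "Apache-2.0" is already an SPDX identifier.
        return license_field
    # Older layout: license = { text = "Apache 2.0" } needs the same
    # space-to-hyphen substitution that the recipe applies.
    return license_field["text"].replace(" ", "-")


old_style = {"license": {"text": "Apache 2.0"}}
new_style = {"license": "Apache-2.0"}
```

Once every repo uses the string form, the substitution branch becomes dead code, which matches the simplified recipe line in the suggested change above.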

- python:
imports:
- rmm
pip_check: false
Contributor

Thanks for the details @vyasr. Let's copy-paste this into a build-planning issue or something that we can use to plan future work. I think we do have worthwhile action items here -- passing pip check would be a very nice-to-have validation of our packaging.

@bdice
Contributor

bdice commented Feb 10, 2025

@gforsyth Conflicts will need to be resolved with #1808 -- we basically need to drop spdlog / fmt dependencies and add rapids-logger.

Contributor

@bdice bdice left a comment

Approving with a minor fix and one question about making all env vars required.

Follow-up work:

  • SPDX licenses in pyproject.toml files
  • Enabling pip check

cuda_version: ${{ (env.get("RAPIDS_CUDA_VERSION") | split("."))[:2] | join(".") }}
cuda_major: ${{ (env.get("RAPIDS_CUDA_VERSION") | split("."))[0] }}
date_string: ${{ env.get("RAPIDS_DATE_STRING") }}
head_rev: ${{ git.head_rev( "." )[:8] }}
Contributor

We don't need spaces around the parameter.

Suggested change
head_rev: ${{ git.head_rev( "." )[:8] }}
head_rev: ${{ git.head_rev(".")[:8] }}

SCCACHE_REGION: ${{ env.get("SCCACHE_REGION", default="") }}
SCCACHE_S3_USE_SSL: ${{ env.get("SCCACHE_S3_USE_SSL", default="") }}
SCCACHE_S3_NO_CREDENTIALS: ${{ env.get("SCCACHE_S3_NO_CREDENTIALS", default="") }}
SCCACHE_S3_KEY_PREFIX: librmm-${{ env.get("RAPIDS_CONDA_ARCH", default="") }}
Contributor

I don't think we want a default here. Otherwise we might get behavior that tries to read/write librmm- when the default "" is used. That would be a bug.

Can we make all of these env vars required? Certainly all the sccache-related ones should be mandatory imo, as they force our CI to be correct. For various reasons, such as our deep reliance on gha-tools, we don't expect recipes to be built outside of our CI images, so I am okay with having the recipes enforce correctness in CI.
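
The failure mode described here is easy to reproduce outside of a recipe. A minimal Python sketch, assuming `RAPIDS_CONDA_ARCH` is unset: a lookup with an empty default silently produces the prefix `librmm-`, while a required lookup fails immediately. The helper names and the sample value `amd64` are hypothetical.

```python
def key_prefix_with_default(env: dict[str, str]) -> str:
    """Lenient lookup: an unset variable silently yields 'librmm-'."""
    return "librmm-" + env.get("RAPIDS_CONDA_ARCH", "")


def key_prefix_required(env: dict[str, str]) -> str:
    """Strict lookup: an unset variable raises KeyError instead."""
    return "librmm-" + env["RAPIDS_CONDA_ARCH"]


# With an empty environment the lenient helper returns a truncated prefix,
# whereas the strict helper would raise before any cache is touched.
truncated = key_prefix_with_default({})
complete = key_prefix_required({"RAPIDS_CONDA_ARCH": "amd64"})
```

In real use both helpers would receive `os.environ`; plain dictionaries are used here so the behavior is visible without changing the process environment.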

cuda_major: ${{ (env.get("RAPIDS_CUDA_VERSION") | split("."))[0] }}
date_string: ${{ env.get("RAPIDS_DATE_STRING") }}
py_version: ${{ env.get("RAPIDS_PY_VERSION") }}
head_rev: ${{ git.head_rev( "." )[:8] }}
Contributor

Suggested change
head_rev: ${{ git.head_rev( "." )[:8] }}
head_rev: ${{ git.head_rev(".")[:8] }}

Comment on lines +30 to +31
-c rapidsai \
-c rapidsai-nightly \
Contributor

I wonder if we actually need both of these channels or if we should tighten things up using rapids-is-release-build to select one or the other.

- if: cuda_major == "11"
then: cuda-cudart-dev
about:
homepage: ${{ load_from_file("python/librmm/pyproject.toml").project.urls.Homepage }}
Contributor

Can we load_from_file once into something in the context and then access that data? I don't know if rattler is smart enough to cache the file on its own.

Labels: ci, conda, improvement, non-breaking
Projects
Status: Review
4 participants