feat: Generate meta.yaml dependencies #45
Conversation
A partial example of this can be seen in rapidsai/cudf#13022
"if": { | ||
"properties": { "table": { "const": "project.optional-dependencies" } } | ||
}, | ||
"then": { | ||
"required": ["key"] | ||
}, | ||
"else": { | ||
"not": { | ||
"required": ["key"] | ||
} |
I was having trouble getting this conditional to behave correctly. According to the official JSON Schema docs it should have been sufficient to add a `"required": ["table"]` to the `if` statement, but no matter what I did the validator insisted that the `key` property was required when only the `section` key was present. I'll need to revisit this, but didn't want to block the rest of the review on a schema issue.
CC @csadorf if you have thoughts here.
So when you simply added a `"required": ["table"]` to the `"if"` block and removed the `"then"` and `"else"` blocks, it would always require it? That indeed seems very odd. Can you provide some valid/invalid examples that we can test against?
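For concreteness, here is a plain-Python sketch of the rule the schema's `if`/`then`/`else` is meant to encode, along with the kind of valid/invalid instances we could test a fixed schema against. `key_rule_ok` is a hypothetical stand-in for the validator, not the real jsonschema machinery:

```python
# Hypothetical helper encoding the intended rule: "key" is required when
# the table is project.optional-dependencies, and forbidden otherwise.
def key_rule_ok(entry):
    if entry.get("table") == "project.optional-dependencies":
        return "key" in entry
    return "key" not in entry

# Instances a correct schema should accept:
valid = [
    {"table": "project.optional-dependencies", "key": "test"},
    {"table": "project.dependencies"},
    {"section": "build"},  # only "section" present: "key" must not be required
]

# Instances a correct schema should reject:
invalid = [
    {"table": "project.optional-dependencies"},        # missing "key"
    {"table": "project.dependencies", "key": "test"},  # "key" not allowed here
]

assert all(key_rule_ok(e) for e in valid)
assert not any(key_rule_ok(e) for e in invalid)
```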
For versions that need to be in … with … If that works, then we ought to be able to generate the entire config file with dfg. Not sure if we want to go that route, or if that is in scope for this PR, though.
We should also keep in mind the alternative proposed by @jakirkham in rapidsai/rmm#1220. I've already outlined my thoughts on the pros and cons there, so just restarting that conversation now.
I like this approach a lot better.
I'll summarize our huddle about this PR here for posterity:
I had some concerns about how we'd account for these Jinja functions provided by `conda-build`.
The results of our conversation were:
- We can likely drop the use of `pin_compatible` altogether and just manually manage those version relationships wherever necessary. You mentioned we have to do this for wheels anyway.
- We'd like to continue using `pin_subpackage`, since we won't be able to emulate its `exact=True` flag functionality (which is pretty much the only flag that we use).
- We can likely drop the use of `compiler` altogether, assuming:
  - We can identify the packages that result from calling `compiler()` for different architectures
  - Conda provides us with some way to load external files with variable names (e.g. `meta_dependencies_x86_64_build.yaml` vs `meta_dependencies_aarch64_build.yaml`)
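For reference, the `pin_subpackage` usage in question looks roughly like this in a split recipe. This is an illustrative sketch, not an excerpt from a real recipe:

```yaml
# Illustrative meta.yaml fragment (package names are examples).
# pin_subpackage(..., exact=True) pins the run dependency to the exact
# version *and* build string of the sibling output built in the same recipe,
# which is the behavior we can't easily reproduce by hand.
outputs:
  - name: librmm
  - name: rmm
    requirements:
      run:
        - {{ pin_subpackage("librmm", exact=True) }}
```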
After reading about the `compiler` function, it seems like it is mostly used for cross-compiling, which we don't do (we natively compile on …
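As a sketch of what dropping `compiler` could look like under the assumptions above: the resolved package names below are what conda-forge's C compiler metapackage maps to per platform, and should be double-checked rather than taken as given.

```yaml
# Hedged sketch: replacing {{ compiler('c') }} with the concrete packages
# it is believed to resolve to per architecture (names assumed from
# conda-forge defaults), loaded via per-arch dependency files.

# meta_dependencies_x86_64_build.yaml
dependencies:
  - gcc_linux-64

# meta_dependencies_aarch64_build.yaml
dependencies:
  - gcc_linux-aarch64
```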
From these conda-forge docs, it seems that … So I'm assuming we can just replace … @jakirkham, can you corroborate all this?
Overall this seems like a rather trivial change and I would have no major objections. But I'm wondering whether it is too trivial, considering that a conda_meta file can be fairly complex.
Can you add at least one valid example input and expected output (and thus test)?
How does this mesh with tools like Grayskull?
```python
default_conda_dir = "conda/environments"
default_conda_meta_dir = "conda/recipes/"
```
Any reason to use a trailing slash here, but not for the other dirs?
extras.get("output", ""), | ||
extras.get("section", ""), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
extras.get("output", ""), | |
extras.get("section", ""), | |
extras.get("output"), | |
extras.get("section"), |
Since you are filtering on None
anyways.
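For context on the suggestion: `dict.get` with no default already returns `None` for a missing key, so the explicit `""` default only matters if empty strings need to survive a downstream `None` filter. A small sketch (the `extras` dict here is illustrative):

```python
extras = {"output": "conda_meta"}

# With an explicit default, a missing key yields "" ...
assert extras.get("section", "") == ""
# ...while with no default it yields None.
assert extras.get("section") is None

# Filtering on None, only the keys actually present survive:
values = [extras.get("output"), extras.get("section")]
filtered = [v for v in values if v is not None]
assert filtered == ["conda_meta"]
```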
```diff
@@ -55,6 +56,8 @@ def dedupe(dependencies):
     """
     deduped = sorted({dep for dep in dependencies if not isinstance(dep, dict)})
     dict_deps = defaultdict(list)
+    # The purpose of the outer loop is to support nested dependency lists such as the
+    # `pip:` list. If multiple are present, they must be internally deduped as well.
```
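A minimal sketch of the behavior that comment describes. This is a simplified stand-in for the project's `dedupe`, not its exact implementation: plain string dependencies are set-deduped, while dict entries like `pip:` lists are merged under their key and deduped together.

```python
from collections import defaultdict


def dedupe(dependencies):
    # Plain (string) dependencies are deduped with a set.
    deduped = sorted({dep for dep in dependencies if not isinstance(dep, dict)})
    # Dict entries like {"pip": [...]} may appear more than once; collect
    # their inner lists under a shared key so they can be deduped together.
    dict_deps = defaultdict(list)
    for dep in dependencies:
        if isinstance(dep, dict):
            for key, nested in dep.items():
                dict_deps[key].extend(nested)
    for key in dict_deps:
        dict_deps[key] = sorted(set(dict_deps[key]))
    if dict_deps:
        deduped.append(dict(dict_deps))
    return deduped


# Two pip: lists are merged and internally deduped:
assert dedupe(["a", {"pip": ["x", "y"]}, {"pip": ["y", "z"]}]) == [
    "a",
    {"pip": ["x", "y", "z"]},
]
```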
Unrelated?
```
Notes
-----
An empty `gridspec` dict will result in an empty dict as the single yielded value.
```
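That edge case follows naturally from `itertools.product`, which yields exactly one (empty) combination when given no iterables. A minimal sketch of a gridspec expansion; `expand_gridspec` is a hypothetical helper, not the project's code:

```python
import itertools


def expand_gridspec(gridspec):
    """Yield one dict per combination of the gridspec's value lists."""
    keys = sorted(gridspec)
    for combo in itertools.product(*(gridspec[key] for key in keys)):
        yield dict(zip(keys, combo))


# An empty gridspec yields a single empty dict:
assert list(expand_gridspec({})) == [{}]

# A populated one yields the cross product of its value lists:
assert list(expand_gridspec({"cuda": ["11.8", "12.0"], "arch": ["x86_64"]})) == [
    {"arch": "x86_64", "cuda": "11.8"},
    {"arch": "x86_64", "cuda": "12.0"},
]
```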
Unrelated?
```diff
@@ -131,6 +138,12 @@ def make_dependency_file(
             "dependencies": dependencies,
         }
     )
+    elif file_type == str(OutputTypes.CONDA_META):
+        file_contents += yaml.dump(
```
Can we assume that the header will be valid for all generated file types?
```python
elif file_type == str(OutputTypes.CONDA_META):
    file_contents += yaml.dump(
        {
            "dependencies": dependencies,
```
The "dependencies" section of a conda-meta file is called "requirements". What am I missing here?
```json
@@ -162,22 +163,16 @@
"extras": {
    "type": "object",
    "properties": {
        "section": {
```
Does the keyword "section" originate from the conda docs? I am wondering whether it would make sense to better namespace our "extras", something like "conda-meta-section" instead of just "section".
"if": { | ||
"properties": { "table": { "const": "project.optional-dependencies" } } | ||
}, | ||
"then": { | ||
"required": ["key"] | ||
}, | ||
"else": { | ||
"not": { | ||
"required": ["key"] | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So when you simply added a "required": ["table"]
to the "if"
block and removed the "then"
and "else"
blocks it would always require it? That seems indeed very odd. Can you provide some valid/invalid examples that we can test against?
To be clear, we are not generating a full recipe anymore, just simple YAML files containing dependency lists. That's basically a subset of what we already need for conda environment.yaml files (same thing, but without the channels). That's why this is so simple right now.
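To make that concrete, a sketch of the two shapes side by side (contents illustrative):

```yaml
# conda environment.yaml (what dfg already emits):
channels:
  - rapidsai
  - conda-forge
dependencies:
  - python >=3.9
  - numpy

# generated conda-meta dependency file (same shape, minus channels):
dependencies:
  - python >=3.9
  - numpy
```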
I will when I get a chance. This work has temporarily been deprioritized since we don't want to make this kind of change until after we have CUDA 12 compatible recipes, but once that's done I'll be getting back to this and will add some then.
Grayskull's goals are a bit different. IIUC, it's for generating an entire recipe given a PyPI package or a repo. However, the resulting recipe will be too simple for our purposes, since it won't encode all of the extra information we put in our recipes around things like `run_exports`, for instance, nor will it support complex features like split outputs. This PR is an approach that lets us share dependencies with the rest of our infrastructure while still having the flexibility to fill out the non-dependency parts of the recipe, which strikes the best balance for us.
So the conda-meta semantics here means "this will be needed for a conda recipe" rather than "we are generating a conda-meta file"?
Sure, but I would be surprised if some of our downstream processing tools wouldn't share at least some of the logic. Or on the flip side, maybe a Grayskull-generated recipe could be further processed. Or we add missing features to Grayskull. Maybe none of this is appropriate, but I wanted to mention it to make sure it has at least been considered.
Correct. Happy to rename if you think there is a better way to convey this.
These are good thoughts. My opinion here is that conda recipes are fairly complex configuration files, and we don't want to be in the business of dealing with them directly. dfg is purely about dependencies.

In the case of pyproject.toml files, we can modify in place largely because a package exists (tomlkit) that supports format-preserving in-place modification. Since conda recipes are a superset of YAML (they also support Jinja, etc.), we have no way to do the same without rolling our own parser. Moreover, the complexity of split recipes means that we would need a lot of complexity in dfg to handle writing to subsections that are in split outputs vs. at the top level of a recipe. As a result, we chose to go this route of generating separate files for each dependency set.

Even if we were able to modify in place, however, we still wouldn't want to modify any section aside from dependency lists. As a result, I would expect that the only overlap we have with Grayskull is in Grayskull's ability to generate the dependency sections of meta.yaml from either setup.py or pyproject.toml (I haven't looked at how it does this). I'm not sure how much work it's worth to try to find that common ground in order to share code/logic.
I'm going to close this for now. A decent bit has changed since this initial implementation. Also, depending on whether we move forward with rapidsai/build-planning#47, there may be much better ways to accomplish our goals in this PR (i.e. writing out meta.yaml from a template directly). We should still address this issue, but the code in this PR is probably no longer the best way to do so.
This PR is an alternative to #28 that is based on generating separate dependency files instead of modifying `meta.yaml` in place. It is a much simpler changeset as a result. I considered putting the data into the `conda_build_config.yaml` file instead of using separate YAML files for each dependency set, but I rejected that option because dfg is based around writing separate `files` sections for every include list, and the only way to map that to a single `conda_build_config.yaml` file is to do a lot of in-place overwriting, which we have generally decided to avoid unless absolutely necessary, i.e. for pyproject.toml (otherwise we could just do that for `meta.yaml` too). Willing to be convinced otherwise, though.

In order to support split recipes, this approach requires writing separate files for every section in every output, which may be too verbose for our liking. Curious to hear opinions there.