NVIDIA: Adding cuPQC as a backend for ML-KEM. #2044

stevenireeves · 2025-01-15T15:47:08Z

This PR adds the support for the NVIDIA library cuPQC to be used as the backend for ML-KEM algorithms.

cuPQC requires the use of an NVIDIA GPU to perform the PQC algorithms.

Does this PR change the input/output behaviour of a cryptographic algorithm (i.e., does it change known answer test values)? (If so, a version bump will be required from x.y.z to x.(y+1).0.)
Does this PR change the list of algorithms available -- either adding, removing, or renaming? Does this PR otherwise change an API? (If so, PRs in fully supported downstream projects dependent on these, i.e., oqs-provider will also need to be ready for review and merge by the time this is merged.)

Signed-off-by: Steven Reeves <[email protected]>

stevenireeves · 2025-01-15T15:48:04Z

@praveksharma @neil-lindquist for visibility.

stevenireeves · 2025-01-15T15:52:24Z

@praveksharma looks like a number of github actions are failing, can you help with this, or point to what needs to be changed?

baentsch · 2025-01-15T16:13:55Z

@stevenireeves For the code formatting errors, please review this. For the basic test error (and error, actually), please open the "twisty" in the CI error report e.g. here at the "Configure" step to see the config command executed: You can run locally and see the problem: The new config variable apparently isn't initialized in all cases:

praveksharma · 2025-01-15T19:03:41Z

@stevenireeves I've pushed 3 commits to https://github.com/open-quantum-safe/liboqs/tree/libOQSxcuPQC, which builds off of your branch, to get it to pass CI. I can't seem to push the changes directly to your fork.

Here's how to reproduce the changes:

Commit 1:

export LIBOQS_DIR=`pwd`
cd ./scripts/copy_from_upstream
python3 ./copy_from_upstream.py copy

Commit 2:

./tests/run_astyle.sh --no-dry-run

Commit 3: replace spaces with tabs in scripts/copy_from_upstream/src/kem/family/kem_scheme.c as in commit.

praveksharma · 2025-01-15T20:54:33Z

Thank you for making these changes @stevenireeves! Since I reviewed the private PR initially I'll let this one be reviewed by reviewers other than myself.

dstebila · 2025-01-15T21:53:20Z

I talked to @praveksharma just now to understand some of the approach here, and I understand why the cupqc metadata is patched into the PQ Crystals meta.yml file (since there isn't another meta.yml to patch it into); so that makes sense. But I think a few other pieces could be done differently. I don't think the *.cu files need to be added via the patch file; couldn't they just be added directly without also being added through the patch? And if so then could we give those directories a name that doesn't include pqcrystals -- e.g., just src/kem/ml_kem/cupqc-ml-kem-768?

stevenireeves · 2025-01-15T22:22:11Z

> I talked to @praveksharma just now to understand some of the approach here, and I understand why the cupqc metadata is patched into the PQ Crystals meta.yml file (since there isn't another meta.yml to patch it into); so that makes sense. But I think a few other pieces could be done differently. I don't think the *.cu files need to be added via the patch file; couldn't they just be added directly without also being added through the patch? And if so then could we give those directories a name that doesn't include pqcrystals -- e.g., just src/kem/ml_kem/cupqc-ml-kem-768?

@dstebila with this organizational change will we need to mess with the copy_from_upstream stuff as well?

praveksharma · 2025-01-15T22:24:10Z

> @dstebila with this organizational change will we need to mess with the copy_from_upstream stuff as well?

@stevenireeves I shall work on this and attempt to push the changes directly to your fork. I don't believe copy_from_upstream.py would need to be changed significantly.

stevenireeves · 2025-01-15T22:27:35Z

> > @dstebila with this organizational change will we need to mess with the copy_from_upstream stuff as well?
>
> @stevenireeves I shall work on this and attempt to push the changes directly to your fork. I don't believe copy_from_upstream.py would need to be changed significantly.

I have the "allow edits from maintainers" option selected, so you should be able to make those changes.

praveksharma · 2025-01-15T23:16:52Z

> I don't think the *.cu files need to be added via the patch file; couldn't they just be added directly without also being added through the patch? And if so then could we give those directories a name that doesn't include pqcrystals -- e.g., just src/kem/ml_kem/cupqc-ml-kem-768?

@dstebila handling the naming shouldn't be an issue. The *.cu file must be sourced on every run of copy_from_upstream.py in delete mode. Since there is no upstream, the *.cu file must be sourced from within the liboqs repo, and a patch is the most straightforward way of doing this. Would a separate patch file for the *.cu file be adequate?

baentsch · 2025-01-16T07:16:49Z

> I understand why the cupqc metadata is patched into the PQ Crystals meta.yml file (since there isn't another meta.yml to patch it into); so that makes sense

Hmm -- it doesn't to me: We have #2041 in the pipeline with the declared goal of removing the PQCrystals files. So this PR thus would need to be re-done after that landed, right? That doesn't seem sensible.

Wouldn't it be much more sensible to have this code be contained in an upstream of its own to pull it from? As far as I can see, this is effectively a different implementation with different license terms, characteristics etc. Such split also would make responsibilities clear: NVIDIA is to support the interfacing to its library (that may very well change over time unbeknownst to OQS) and OQS is responsible for the proper operation within/integration into the OQS APIs (that may change unbeknownst to NVIDIA). Or is the intention by NVIDIA to become committed, well committers and/or maintainers to OQS @stevenireeves ?

stevenireeves · 2025-01-16T15:39:02Z

> Hmm -- it doesn't to me: We have #2041 in the pipeline with the declared goal of removing the PQCrystals files. So this PR thus would need to be re-done after that landed, right? That doesn't seem sensible.
>
> Wouldn't it be much more sensible to have this code be contained in an upstream of its own to pull it from? As far as I can see, this is effectively a different implementation with different license terms, characteristics etc. Such split also would make responsibilities clear: NVIDIA is to support the interfacing to its library (that may very well change over time unbeknownst to OQS) and OQS is responsible for the proper operation within/integration into the OQS APIs (that may change unbeknownst to NVIDIA). Or is the intention by NVIDIA to become committed, well committers and/or maintainers to OQS @stevenireeves ?

NVIDIA maintaining its own fork of liboqs is not what we want to do. The intention of NVIDIA is to support the portions of liboqs that use cuPQC as their backend. So if there are significant changes that require altering the cuPQC-related source files in liboqs, we will help there (so long as we have the internal support). I believe that @praveksharma is in the process of removing the cupqc metadata from the PQ Crystals meta.yml; correct me if I'm wrong.

dstebila · 2025-01-16T15:53:18Z

> > I don't think the *.cu files need to be added via the patch file; couldn't they just be added directly without also being added through the patch? And if so then could we give those directories a name that doesn't include pqcrystals -- e.g., just src/kem/ml_kem/cupqc-ml-kem-768?
>
> @dstebila handling the naming shouldn't be an issue. The *.cu file must be sourced on every run of copy_from_upstream.py in delete mode. Since there is no upstream, the *.cu file must be sourced from within the liboqs repo, and a patch is the most straightforward way of doing this. Would a separate patch file for the *.cu file be adequate?

Oh, I see now. I knew that we have some local implementations that --delete doesn't wipe out (e.g., FrodoKEM), but these algorithms aren't handled at all by copy_from_upstream. But I understand now that for algorithms that are handled by copy_from_upstream, there's no mixing of local (not deleted) and upstream (deleted) implementations.

So I see why the patch file is adding these.

dstebila · 2025-01-16T15:56:51Z

> > Wouldn't it be much more sensible to have this code be contained in an upstream of its own to pull it from? As far as I can see, this is effectively a different implementation with different license terms, characteristics etc. Such split also would make responsibilities clear: NVIDIA is to support the interfacing to its library (that may very well change over time unbeknownst to OQS) and OQS is responsible for the proper operation within/integration into the OQS APIs (that may change unbeknownst to NVIDIA). Or is the intention by NVIDIA to become committed, well committers and/or maintainers to OQS @stevenireeves ?
>
> NVIDIA maintaining its own fork of liboqs is not what we want to do. The intention of NVIDIA is to support the portions of liboqs that use cuPQC as their backend. So if there are significant changes that require altering the cuPQC-related source files in liboqs, we will help there (so long as we have the internal support). I believe that @praveksharma is in the process of removing the cupqc metadata from the PQ Crystals meta.yml; correct me if I'm wrong.

I think these are talking about two slightly different things. Michael isn't suggesting that NVIDIA maintain a separate fork of liboqs. The way our code has been structured is that we have scripts to pull in source code from (self-contained) implementations of algorithms in other repositories, and add them to liboqs using code generation and patches.

This could be done that way, but since it is also only one file that is being added, I don't think it's worth the effort of setting up a separate upstream for that.

baentsch · 2025-01-16T15:57:02Z

> NVIDIA maintaining its own fork of liboqs is not what we want to do.

This is not my ask: I only (meant to :) ask whether you'd want to maintain the wrapper code around the cuPQC code in a separate project -- complete with META.yml such as for copy_from_upstream to pull it in -- thus obviating the need for patching other implementations' META.yml files.

SWilson4 · 2025-01-16T16:00:51Z

I added the --delete functionality for copy_from_upstream quite recently. I can add an exception for the cuda files so that they don't get cleaned out; it would be quite straightforward and IMO would make maintenance of those files a lot easier. Editing patch files instead of source files is a pain. How does this sound @praveksharma?
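
For readers unfamiliar with the idea, here is a minimal sketch of what such an exception could look like. It is not the actual copy_from_upstream.py code: the function name `clean_tree`, the `keep_patterns` parameter, and the directory and file names in the demo are all hypothetical. It removes every file under a directory except those matching a keep-list such as `*.cu`, and prunes directories that end up empty.

```python
import fnmatch
import os
import tempfile


def clean_tree(root, keep_patterns=("*.cu",)):
    """Delete files under root unless their name matches a keep pattern.

    Hypothetical helper for illustration only; copy_from_upstream.py may
    organise its --delete handling quite differently.
    """
    removed = []
    # Walk bottom-up so that emptied directories can be pruned afterwards.
    for dirpath, _dirnames, filenames in os.walk(root, topdown=False):
        for name in filenames:
            if any(fnmatch.fnmatch(name, pat) for pat in keep_patterns):
                continue  # locally maintained file, leave it in place
            path = os.path.join(dirpath, name)
            os.remove(path)
            removed.append(os.path.relpath(path, root))
        # Remove the directory only if nothing was kept inside it.
        if dirpath != root and not os.listdir(dirpath):
            os.rmdir(dirpath)
    return sorted(removed)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        # Directory and file names below are made up for the demo.
        impl = os.path.join(tmp, "cupqc_impl")
        os.makedirs(impl)
        for fname in ("wrapper.cu", "generated.c"):
            open(os.path.join(impl, fname), "w").close()
        print("deleted:", clean_tree(tmp))
        print("kept:", os.listdir(impl))
```

The trade-off discussed in this thread still applies: an exception list keeps the CUDA sources editable in place, while a separate upstream keeps the import path uniform for every implementation.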

praveksharma · 2025-01-16T16:03:59Z

> I added the --delete functionality for copy_from_upstream quite recently. I can add an exception for the cuda files so that they don't get cleaned out; it would be quite straightforward and IMO would make maintenance of those files a lot easier. Editing patch files instead of source files is a pain. How does this sound @praveksharma?

Thank you for offering @SWilson4, but I think it is more prudent to create a separate git repository to store cupqc_ml-kem metadata since the pqcrystals upstream is going to be deprecated soon in any case. If things work out okay, @stevenireeves or someone else from NVIDIA can take ownership of the repo.

stevenireeves · 2025-01-16T17:29:23Z

> Thank you for offering @SWilson4, but I think it is more prudent to create a separate git repository to store cupqc_ml-kem metadata since the pqcrystals upstream is going to be deprecated soon in any case. If things work out okay, @stevenireeves or someone else from NVIDIA can take ownership of the repo.

@praveksharma I am not sure what metadata you want us to store. Do you want us to write the wrapper code in an additional repository?

praveksharma · 2025-01-17T02:24:17Z

> I think it is more prudent to create a separate git repository to store cupqc_ml-kem metadata since the pqcrystals upstream is going to be deprecated soon in any case

I've updated the import mechanism to use https://github.com/praveksharma/cupqc-mlkem as an upstream. Ideally I would like to modify copy_from_upstream to not require ad hoc upstreams for situations such as this.

Could you please review the updated PR @SWilson4 @dstebila @baentsch ?

stevenireeves · 2025-01-17T04:43:21Z

> I've updated the import mechanism to use https://github.com/praveksharma/cupqc-mlkem as an upstream. Ideally I would like to modify copy_from_upstream to not require ad hoc upstreams for situations such as this.

Thanks @praveksharma

baentsch

Basically LGTM, @stevenireeves @praveksharma; just some nits (see single comments). Also the question whether it would be worthwhile adding build documentation and a "tier 3" entry to PLATFORMS.md (as I guess we'll never be able to test this in CI, right?)

baentsch · 2025-01-17T07:20:28Z

CONFIGURE.md

@@ -124,6 +125,13 @@ Dynamically load OpenSSL through `dlopen`. When using liboqs from other cryptogr

 Only has an effect if the system supports `dlopen` and ELF binary format, such as Linux or BSD family.

+### OQS_USE_CUPQC
+
+Can be `ON` or `OFF`.  When `ON`, use NVIDIA's cuPQC library where able (currently just ML-KEM).  When this option is enabled, liboqs may not run correctly on machines that lack supported GPUs. To download cuPQC follow the instructions at (https://developer.nvidia.com/cupqc-download/). Detailed descriptions of the API, requirments, and installation guide are in the cuPQC documentation (https://docs.nvidia.com/cuda/cupqc/index.html). 


nit: typo "requirements"

baentsch · 2025-01-17T07:21:18Z

docs/algorithms/kem/ml_kem.md

@@ -9,6 +9,10 @@
 - **Primary Source**<a name="primary-source"></a>:
  - **Source**: https://github.com/pq-crystals/kyber/commit/10b478fc3cc4ff6215eb0b6a11bd758bf0929cbd with copy_from_upstream patches
  - **Implementation license (SPDX-Identifier)**: CC0-1.0 or Apache-2.0
+- **Optimized Implementation sources**: https://github.com/pq-crystals/kyber/commit/10b478fc3cc4ff6215eb0b6a11bd758bf0929cbd with copy_from_upstream patches
+  - **cupqc-cuda**:<a name="cupqc-cuda"></a>
+      - **Source**: https://github.com/praveksharma/cupqc-mlkem/commit/adb8454e56979628c07b67eb7d90f9337be6dc30


Is this going to move to a non-personal GH repo?

praveksharma · 2025-01-17

That is the goal @baentsch. Alternatively, I would like to modify copy_from_upstream to store changes locally in the case of a minimal "upstream" such as this one.

baentsch · 2025-01-17T07:23:11Z

scripts/copy_from_upstream/copy_from_upstream.yml

@@ -38,6 +38,14 @@ upstreams:
    kem_meta_path: '{pretty_name_full}_META.yml'
    kem_scheme_path: '.'
    patches: [pqcrystals-ml_kem.patch]
+  - 
+    name: cupqc
+    git_url: https://github.com/praveksharma/cupqc-mlkem.git


Please eventually move to long-term maintained GH repo

SWilson4

I know we'd need to get additional hardware to run CI tests on the CUDA code, but is it possible to add a build-only job?

praveksharma · 2025-01-17T17:42:20Z

> I know we'd need to get additional hardware to run CI tests on the CUDA code, but is it possible to add a build-only job?

I'm trying to do this on a machine without an NVIDIA GPU and am running into trouble. Do you know if this is possible @stevenireeves?

> Also the question whether it would be worthwhile adding build documentation and a "tier 3" entry to PLATFORMS.md (as I guess we'll never be able to test this in CI, right?)

I think we should do this. But I'm not familiar enough with GPU architectures to know which specific platform to list. @stevenireeves, given cuPQC's requirements:

    CUDA Toolkit 12.4 or newer
    Supported CUDA compiler
    x86_64 CPU
    An NVIDIA GPU with one of the following architectures: 70, 75, 80, 86, 89, 90

is listing "NVIDIA GPU architectures 70, 75, 80, 86, 89, and 90 with a x86_64 CPU" sufficient information for those in the know? For reference here are other Tier 3 platforms from PLATFORMS.md:

    x86 for Windows (Visual Studio Toolchain)
    ppc64le for Ubuntu (Focal)
    s390x for Ubuntu (Focal)
    loongarch64 for Debian Linux (trixie)

stevenireeves · 2025-01-17T19:23:43Z

@praveksharma

> I'm trying to do this on a machine without an NVIDIA GPU and am running into trouble. Do you know if this is possible @stevenireeves?

I think if you have the NVIDIA toolkit 12.4 (even though the machine does not have a GPU) you should be able to build.

> I think we should do this. But I'm not familiar enough with GPU architectures to know which specific platform to list. @stevenireeves, given cuPQC's requirements:
>
>     CUDA Toolkit 12.4 or newer
>     Supported CUDA compiler
>     x86_64 CPU
>     An NVIDIA GPU with one of the following architectures: 70, 75, 80, 86, 89, 90
>
> is listing "NVIDIA GPU architectures 70, 75, 80, 86, 89, and 90 with a x86_64 CPU" sufficient information for those in the know? For reference here are other Tier 3 platforms from PLATFORMS.md:

cuPQC only supports Linux with x86_64 currently. We only officially support archs 70-90, but I don't see why any GPU made after arch 70 wouldn't work. I think we could say "Linux NVIDIA GPU architectures 70, 75, 80, 86, 89, and 90 with a x86_64 CPU" in that list and be good.

In the future we can update if we support more OS/CPUs and GPUs.
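
As a rough illustration of the platform statement above, the sketch below checks a CUDA compute capability against the officially supported list. The list is copied from the requirements quoted in this thread; the helper names are hypothetical, and the code assumes capabilities arrive as "major.minor" strings (for example "8.6" for architecture 86). It says nothing about whether newer GPUs would work in practice.

```python
# Architectures listed as officially supported in this thread.
SUPPORTED_CUDA_ARCHS = (70, 75, 80, 86, 89, 90)


def parse_compute_capability(cap):
    """Turn a compute capability string such as "8.6" into 86."""
    major, minor = cap.strip().split(".")
    return int(major) * 10 + int(minor)


def is_officially_supported(cap):
    """Return True if the capability is in the officially supported list."""
    return parse_compute_capability(cap) in SUPPORTED_CUDA_ARCHS


if __name__ == "__main__":
    for cap in ("7.0", "8.6", "6.1"):
        print(cap, is_officially_supported(cap))
```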

praveksharma · 2025-01-17T19:45:47Z

> I know we'd need to get additional hardware to run CI tests on the CUDA code, but is it possible to add a build-only job?

> I think if you have the NVIDIA toolkit 12.4 (even though the machine does not have a GPU) you should be able to build.

This would require the CI images to be updated with the NVIDIA toolkit, so I'll get started on that.

baentsch · 2025-01-19T07:09:38Z

> This would require the CI images to be updated with the NVIDIA toolkit, so I'll get started on that.

If there's no concrete desire to list this feature in a higher support tier (who wants that?) that'd be "future work" in my eyes, @praveksharma (or at least lower priority than many other open issues) and for me not a prerequisite for an approval of this PR.

stevenireeves · 2025-01-22T19:52:22Z

@praveksharma looks like the build with OQS_USE_CUPQC is still failing in this PR, although I can't see what is causing the failures. Did Neil's comments in the email help?

praveksharma · 2025-01-22T20:07:52Z

> @praveksharma looks like the build with OQS_USE_CUPQC is still failing in this PR, although I can't see what is causing the failures. Did Neil's comments in the email help?

@stevenireeves Yes, the comments were super helpful. I can successfully build locally; this is most likely an issue with the CI config itself that I'm still troubleshooting.

praveksharma · 2025-01-23T19:24:38Z

@stevenireeves cmake in CI is failing with this message:

CMake Error at /usr/share/cmake-3.28/Modules/CMakeDetermineCUDACompiler.cmake:270 (message):
Failed to detect a default CUDA architecture.

I am unable to reproduce this error locally in a docker container provisioned from the same image used in CI. Do you know how to fix this?

stevenireeves · 2025-01-23T19:28:19Z

> CMake Error at /usr/share/cmake-3.28/Modules/CMakeDetermineCUDACompiler.cmake:270 (message):
> Failed to detect a default CUDA architecture.
>
> I am unable to reproduce this error locally in a docker container provisioned from the same image used in CI. Do you know how to fix this?

@praveksharma ah, because CI has no GPU it can't detect a default. Try adding this flag to the CMake command.
-DCMAKE_CUDA_ARCHITECTURES=80

Specifically for CI.

stevenireeves · 2025-01-23T20:17:51Z

@praveksharma looks like that CI test is still failing.

No CMAKE_CUDA_COMPILER could be found.

Tell CMake where to find the compiler by setting either the environment
variable "CUDACXX" or the CMake cache entry CMAKE_CUDA_COMPILER to the full
path to the compiler, or to the compiler name if it is in the PATH.

You'll need to tell CMake where nvcc is.
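
Putting the two CI fixes from this thread together, here is a hedged sketch of how a GPU-less build check could assemble its configure step. It only builds the argument list and environment and does not run anything. The helper name, the nvcc path in the demo, and passing `cuPQC_DIR` as a `-D` cache entry are assumptions for illustration; `OQS_USE_CUPQC`, `CMAKE_CUDA_ARCHITECTURES`, and `CUDACXX` come from the discussion and the CMake error text above.

```python
import os
import shlex


def cupqc_configure_command(source_dir, build_dir, nvcc_path,
                            cuda_arch="80", cupqc_dir=None):
    """Return (argv, env) for a liboqs configure step with cuPQC enabled.

    nvcc_path and cupqc_dir are placeholders; point them at the real
    install locations on the build machine.
    """
    argv = [
        "cmake",
        "-S", source_dir,
        "-B", build_dir,
        "-DOQS_USE_CUPQC=ON",
        # Without a GPU, CMake cannot detect a default architecture,
        # so one is supplied explicitly.
        f"-DCMAKE_CUDA_ARCHITECTURES={cuda_arch}",
    ]
    if cupqc_dir is not None:
        # Assumption: the cuPQC package location is given as a cache entry.
        argv.append(f"-DcuPQC_DIR={cupqc_dir}")
    env = dict(os.environ)
    # CMake reads CUDACXX to locate the CUDA compiler.
    env["CUDACXX"] = nvcc_path
    return argv, env


if __name__ == "__main__":
    # The nvcc path here is only an example location.
    argv, env = cupqc_configure_command(
        ".", "build", "/usr/local/cuda/bin/nvcc")
    print("CUDACXX=" + shlex.quote(env["CUDACXX"]), shlex.join(argv))
```

The returned pair can be handed to `subprocess.run(argv, env=env, check=True)` on a machine that has the CUDA toolkit installed.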

stevenireeves · 2025-01-23T21:32:19Z

@praveksharma looks like that did the trick. Any other comments by the reviewers?

SWilson4

LGTM, with the caveat that we should eventually move away from a personal repo, as noted by @baentsch above.

Adding cuPQC as a backend for ML-KEM.

fb2a8e5

Signed-off-by: Steven Reeves <[email protected]>

Fixing transposition error that left out OQS_USE_CUPQC in CMake system.

2cb8067

Signed-off-by: Steven Reeves <[email protected]>

Add CMake dependent options for cupqc. Fixed formatting in kem_ml_kem…

8c07675

…_####.c and kem/family/kem_scheme.c Signed-off-by: Steven Reeves <[email protected]>

stevenireeves marked this pull request as ready for review January 15, 2025 20:37

stevenireeves requested review from bhess, baentsch, alexrow, praveksharma and dstebila as code owners January 15, 2025 20:37

Move cupqc_ml-kem source files to correctly named dir

7562c85

Signed-off-by: Pravek Sharma <[email protected]>

Stop piggybacking on pqcrystals-kyber-standard and move cupqc_ml-kem …

73ba48f

…metadata to separate upstream repo Signed-off-by: Pravek Sharma <[email protected]>

baentsch reviewed Jan 17, 2025

SWilson4 reviewed Jan 17, 2025

Update licensing information

cf46255

Signed-off-by: Pravek Sharma <[email protected]>

Update PLATFORMS.md

ae3fd37

Signed-off-by: Pravek Sharma <[email protected]>

praveksharma added 6 commits January 20, 2025 22:50

Fix kem_family cmakelists template

cf03b8d

Signed-off-by: Pravek Sharma <[email protected]>

Run copy_from_upsream.py and pull updated upstream

5dc3f49

Signed-off-by: Pravek Sharma <[email protected]>

Add cupqc build test to basic.yml

a3126fc

Signed-off-by: Pravek Sharma <[email protected]>

Move cupqc build test from basic.yml to linux.yml

0fa3031

Signed-off-by: Pravek Sharma <[email protected]>

Fix error in linux.yml

512d071

Signed-off-by: Pravek Sharma <[email protected]>

fixup! Fix error in linux.yml

7f2eb30

Signed-off-by: Pravek Sharma <[email protected]>

Redo cupqc build check

7fbdd13

Signed-off-by: Pravek Sharma <[email protected]>

Supply default CUDA arch to cupqc-buildcheck configuration stage

083b267

Signed-off-by: Pravek Sharma <[email protected]>

praveksharma added 2 commits January 23, 2025 21:22

Specify CUDAXX in cupqc-buildcheck

1ee9973

Signed-off-by: Pravek Sharma <[email protected]>

Make cuPQC_DIR explicit in cupqc-buildcheck

c94b7c7

Signed-off-by: Pravek Sharma <[email protected]>

SWilson4 approved these changes Jan 23, 2025
