Creating jobspec via the Python API for cores and GPUs that share a socket #3152

I confirmed that, with #3151 merged, this is now possible in v0.19.0 with some small modifications to the original example. It turns out that hwloc v1 places the GPUs and cores under the same numanode rather than under the same socket, so after swapping socket for numanode in the generated jobspec, this appears to work as intended on Lassen.
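As a rough illustration of the swap described above, here is a minimal sketch that renames socket vertices to numanode in a jobspec resource section. It operates on a plain Python dict and does not call the Flux Python API; the helper name swap_resource_type and the example resource layout (1 node, 1 socket, 1 slot, 4 cores, 1 GPU) are assumptions for illustration, not the original example from this discussion.

```python
import copy


def swap_resource_type(resources, old="socket", new="numanode"):
    """Return a deep copy of a jobspec resource list with `old` types renamed to `new`.

    Hypothetical helper: it only assumes each resource vertex is a dict with a
    "type" key and an optional "with" list of child vertices.
    """
    swapped = copy.deepcopy(resources)
    stack = list(swapped)
    while stack:
        vertex = stack.pop()
        if vertex.get("type") == old:
            vertex["type"] = new
        # Descend into child vertices, if any.
        stack.extend(vertex.get("with", []))
    return swapped


# Assumed example layout, not taken from the original post.
resources = [
    {
        "type": "node",
        "count": 1,
        "with": [
            {
                "type": "socket",
                "count": 1,
                "with": [
                    {
                        "type": "slot",
                        "count": 1,
                        "label": "task",
                        "with": [
                            {"type": "core", "count": 4},
                            {"type": "gpu", "count": 1},
                        ],
                    }
                ],
            }
        ],
    }
]

numa_resources = swap_resource_type(resources)
```

The helper returns a modified copy and leaves the input untouched, so the socket-based and numanode-based versions can be compared side by side before submitting either one.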

The run line that makes this work is below. Note the FLUXION_RESOURCE_OPTIONS environment variable at the start, which explicitly tells Fluxion to allow numanodes in the resource graph:

FLUXION_RESOURCE_OPTIONS="load-allowlist=node,core,gpu,numanode" PMIX_MCA_gds="^ds12,ds21" jsrun -a 1 -c ALL_CPUS -g ALL_GPUS -n 1 --bind=none --smpiargs="-disable_gpu_hooks" flux start ./submit-get-R.sh

Wh…

Answer selected by SteVwonder