Question about algorithms used for all-gathers and reduce-scatters #1594
CollNet and NVLS are out for sure. I think PAT could be an option, though. You can verify which algorithm NCCL chooses for every collective by running with
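A sketch of how such an inspection run could look, using NCCL's documented debug environment variables (the launch command and binary name are placeholders, and PAT requires NCCL 2.23 or newer):

```shell
# Log NCCL's algorithm/protocol selection decisions.
export NCCL_DEBUG=INFO
export NCCL_DEBUG_SUBSYS=TUNING

# Optionally force a specific algorithm to compare timings:
# export NCCL_ALGO=PAT

# Hypothetical Slurm launch; substitute your own application.
srun ./my_nccl_app
```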
Thank you so much! Follow-up question: what does PAT do, and how should I model its latency and message transmission times? For example, I know that the latency of the ring algorithm is O(G) and its transmission time is O((G-1)/G * N), where G is the number of GPUs and N is the size of the output buffer (for all-gather).
Also, if there are two available options, how does NCCL choose which one to actually use?
PAT is a binomial tree algorithm. There's a bit of info here: https://developer.nvidia.com/blog/new-scaling-algorithm-and-initialization-with-nvidia-collective-communications-library-2-23/

NCCL has an internal performance model for each algorithm, and for each collective operation it picks the algorithm/protocol combo that it expects to perform best under the circumstances.
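For the cost-modeling question above, a rough alpha-beta sketch can contrast the two algorithms. This is an assumption-laden simplification, not NCCL's internal tuner: alpha is the per-step latency and beta the time per byte, and the PAT model here is an idealized binomial tree with log2(G) steps.

```python
import math

def ring_allgather_cost(G, N, alpha, beta):
    # Ring all-gather: G-1 steps, each moving N/G bytes per GPU.
    # Latency term grows linearly in G; total bytes moved = (G-1)/G * N.
    return (G - 1) * alpha + (G - 1) / G * N * beta

def pat_allgather_cost(G, N, alpha, beta):
    # Idealized binomial-tree all-gather (PAT-like): ~log2(G) steps,
    # with the per-GPU data doubling each step, so the bandwidth term
    # is still (G-1)/G * N -- only the latency term shrinks to O(log G).
    steps = math.ceil(math.log2(G))
    return steps * alpha + (G - 1) / G * N * beta

# Example with made-up numbers: 8 GPUs, 1 GiB output buffer,
# 10 us per-step latency, 25 GB/s per-link bandwidth.
G, N = 8, 1 << 30
alpha, beta = 10e-6, 1 / 25e9
print(f"ring: {ring_allgather_cost(G, N, alpha, beta):.6f} s")
print(f"pat:  {pat_allgather_cost(G, N, alpha, beta):.6f} s")
```

Under this model the two differ only in the latency term, which is why tree-style algorithms mainly help at large GPU counts and small message sizes.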
nccl/src/device/generate.py, lines 76 to 83 (commit 80f6bda)
Based on these lines of code, I can see that all-gathers and reduce-scatters can employ RING, COLLNET_DIRECT, NVLS, and PAT. I am working on the Perlmutter supercomputer (https://docs.nersc.gov/systems/perlmutter/architecture/), which has HPE's Slingshot interconnect. Would that mean that, out of these four options, only RING is applicable for all-gathers/reduce-scatters, since the other three depend on having InfiniBand and NVSwitches?