Releases: ROCm/rccl
rccl 2.17.1 for ROCm 5.7.0
Changed
- Compatibility with NCCL 2.17.1-1
- Performance tuning for some collective operations
Added
- Minor improvements to MSCCL codepath
- NCCL_NCHANNELS_PER_PEER support
- Improved compilation performance
- Support for gfx94x
Fixed
- Potential race condition during ncclSocketClose()
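As a minimal sketch of how the NCCL_NCHANNELS_PER_PEER setting listed above might be applied, the Python helper below copies an environment and sets the variable before a job is launched. The helper name `with_channels_per_peer` and the value 4 are illustrative assumptions. These notes do not state the accepted range or default, so consult the RCCL documentation before choosing a value.

```python
import os

def with_channels_per_peer(n, base_env=None):
    """Return a copy of base_env with NCCL_NCHANNELS_PER_PEER set to n.

    Hypothetical helper for illustration only; it is not part of RCCL.
    """
    if n < 1:
        raise ValueError("channel count must be a positive integer")
    # Copy so the caller's environment (or os.environ) is left untouched.
    env = dict(os.environ if base_env is None else base_env)
    env["NCCL_NCHANNELS_PER_PEER"] = str(n)  # environment values must be strings
    return env

# Assumed value of 4, chosen only for the example.
env = with_channels_per_peer(4, base_env={"PATH": "/usr/bin"})
print(env["NCCL_NCHANNELS_PER_PEER"])
```

The returned dictionary can be passed as the `env` argument of `subprocess.run` when launching an application that uses RCCL.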
rccl 2.16.2 for ROCm 5.6.1
RCCL code for ROCm 5.6.1 did not change. The library was rebuilt for the updated ROCm 5.6.1 stack.
rccl 2.16.2 for ROCm 5.6.0
Changed
- Compatibility with NCCL 2.16.2
Fixed
- Removed a workaround in favor of an indirect function call
rccl 2.15.5 for ROCm 5.5.1
RCCL code for ROCm 5.5.1 did not change. The library was rebuilt for the updated ROCm 5.5.1 stack.
RCCL 2.15.5 for ROCm 5.5.0
Changed
- Compatibility with NCCL 2.15.5
- Unit test executable renamed to rccl-UnitTests
Added
- HW-topology aware binary tree implementation
- Experimental support for MSCCL
- New unit tests for hipGraph support
- NPKit integration
Fixed
- rocm-smi ID conversion
- Support for HIP_VISIBLE_DEVICES for unit tests
- Support for p2p transfers to devices that are not visible to HIP
Removed
- TransferBench from tools. It now lives in a standalone repository: https://github.com/ROCmSoftwarePlatform/TransferBench
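To illustrate the HIP_VISIBLE_DEVICES entry for unit tests, the sketch below prepares a command and environment that would restrict the rccl-UnitTests executable to selected GPUs. The executable path `./rccl-UnitTests`, the helper name `unit_test_invocation`, and the device indices are assumptions made for the example. The code only builds the invocation and does not run it.

```python
import os

def unit_test_invocation(devices, exe="./rccl-UnitTests"):
    """Build (command, environment) for a unit-test run limited to `devices`.

    Hypothetical helper; `exe` is an assumed path to the test binary.
    """
    if not devices:
        raise ValueError("at least one device index is required")
    env = dict(os.environ)
    # HIP_VISIBLE_DEVICES takes a comma-separated list of device indices.
    env["HIP_VISIBLE_DEVICES"] = ",".join(str(d) for d in devices)
    return [exe], env

cmd, env = unit_test_invocation([0, 1])
print(cmd, env["HIP_VISIBLE_DEVICES"])
```

Passing `cmd` and `env` to `subprocess.run(cmd, env=env)` would start the tests with only the listed devices visible.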
rccl 2.13.4 for ROCm 5.4.4
RCCL code for ROCm 5.4.4 did not change. The library was rebuilt for the updated ROCm 5.4.4 stack.
rccl 2.13.4 for ROCm 5.4.3
RCCL code for ROCm 5.4.3 did not change. The library was rebuilt for the updated ROCm 5.4.3 stack.
rccl 2.13.4 for ROCm 5.4.2
RCCL code for ROCm 5.4.2 did not change. The library was rebuilt for the updated ROCm 5.4.2 stack.
rccl 2.13.4 for ROCm 5.4.1
RCCL code for ROCm 5.4.1 did not change. The library was rebuilt for the updated ROCm 5.4.1 stack.
RCCL 2.13.4 for ROCm 5.4.0
Changed
- Compatibility with NCCL 2.13.4
- Improvements to RCCL when running with hipGraphs
- RCCL_ENABLE_HIPGRAPH environment variable is no longer necessary to enable hipGraph support
- Minor latency improvements
Fixed
- Resolved potential memory access error due to asynchronous memset