UCC Version 1.1.0 - RC1

Pre-release
@manjugv released this 07 Sep 15:40
· 3 commits to v1.1.x since this release
9f22d78

1.1.0

Features

API

  • Added float128 and complex float32, float64, and float128 data types
  • Added Active Set-based collectives to support dynamic groups as well as point-to-point messaging
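To illustrate how an active set can cover both dynamic groups and point-to-point messaging, here is a minimal sketch. It assumes an active set is described by a (start, stride, size) triple of team ranks, which is our reading of the UCC collective arguments and should be checked against `ucc.h`. The helper names `active_set_ranks` and `p2p_active_set` are hypothetical and are not part of UCC.

```python
def active_set_ranks(start, stride, size, team_size):
    """Expand a (start, stride, size) triple into the team ranks it selects.

    Assumption: the set contains start, start + stride, ..., for `size` members.
    """
    if size < 1:
        raise ValueError("an active set needs at least one member")
    if size > 1 and stride == 0:
        raise ValueError("stride must be non-zero when size > 1")
    ranks = [start + i * stride for i in range(size)]
    for r in ranks:
        if not 0 <= r < team_size:
            raise ValueError(f"rank {r} is outside the team of size {team_size}")
    return ranks


def p2p_active_set(src, dst):
    """Describe a two-member set {src, dst}; a broadcast rooted at src
    over this set behaves like a send from src to dst."""
    if src == dst:
        raise ValueError("src and dst must differ")
    return (src, dst - src, 2)


if __name__ == "__main__":
    # A dynamic group: every second rank starting at 1, three members.
    print(active_set_ranks(1, 2, 3, team_size=8))   # [1, 3, 5]
    # Point-to-point from rank 5 to rank 2, expressed as a two-member set.
    start, stride, size = p2p_active_set(5, 2)
    print(active_set_ranks(start, stride, size, team_size=8))  # [5, 2]
```

The point of the triple is that a subset can be named without creating a new team, so a two-member set is enough to express a send/receive pair.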

Core

  • Added config file support
  • Fixed component search

CL

  • Added split-rail allreduce collective implementation
  • Enabled hierarchical alltoallv
  • Fixed cleanup bugs

TL

  • Added SELF TL to support teams of size one

UCP

  • Added service broadcast
  • Added reduce_scatterv ring algorithm
  • Added k-nomial based gather collective implementation
  • Added one-sided get based algorithms
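The release notes do not describe the reduce_scatterv ring algorithm itself. As a rough illustration, the following sketch simulates a generic textbook ring reduce-scatter with per-rank block sizes and a sum reduction. It is not taken from the UCC sources, and the function name `ring_reduce_scatterv` is hypothetical.

```python
def ring_reduce_scatterv(inputs, counts):
    """Simulate a ring reduce_scatterv with a sum reduction.

    inputs[r] is rank r's full contribution (length sum(counts)).
    counts[b] is the number of elements in block b.
    Returns a list whose entry r is the fully reduced block r, held by rank r.
    """
    n = len(inputs)
    if len(counts) != n:
        raise ValueError("need one count per rank")
    offsets = [0]
    for c in counts:
        offsets.append(offsets[-1] + c)
    if any(len(v) != offsets[-1] for v in inputs):
        raise ValueError("every input must have sum(counts) elements")

    # work[r][b] is rank r's current partial result for block b.
    work = [[list(v[offsets[b]:offsets[b + 1]]) for b in range(n)]
            for v in inputs]

    for step in range(n - 1):
        # Each rank r sends block (r - step - 1) mod n to its right neighbour.
        # Snapshot all outgoing blocks first to model simultaneous sends.
        outgoing = []
        for r in range(n):
            b = (r - step - 1) % n
            outgoing.append((b, list(work[r][b])))
        # The right neighbour accumulates the received block into its own copy.
        for r in range(n):
            dst = (r + 1) % n
            b, data = outgoing[r]
            work[dst][b] = [x + y for x, y in zip(work[dst][b], data)]

    # After n - 1 steps, block r has visited every rank and ends at rank r.
    return [work[r][r] for r in range(n)]


if __name__ == "__main__":
    inputs = [
        [1, 2, 3, 4, 5, 6],
        [10, 20, 30, 40, 50, 60],
        [100, 200, 300, 400, 500, 600],
    ]
    counts = [1, 2, 3]  # block sizes owned by ranks 0, 1, 2
    print(ring_reduce_scatterv(inputs, counts))
    # [[111], [222, 333], [444, 555, 666]]
```

Each rank exchanges data only with its two ring neighbours and sends one block per step, which is why ring schemes are commonly chosen for large messages.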

SHARP

  • Fixed SHARP OOB
  • Added SHARP broadcast

GPU Collectives (CUDA, NCCL TL and RCCL TL)

  • Added support for CUDA TL (intranode collectives for NVIDIA GPUs)
  • Added multiring allgatherv, alltoall in CUDA TL
  • Added NCCL gather, scatter, and their vector variants
  • Enabled use of multiple streams for collectives
  • Added support for RCCL gather(v), scatter(v), broadcast, allgather(v), barrier, alltoall(v), and allreduce collectives
  • Added ROCm memory component
  • Adapted all GPU collectives to the executor design

Tests

  • Added tests for triggered collectives in perftests
  • Fixed bugs in multi-threading tests

Utils

  • Added CPU model and vendor detection
  • Fixed several bugs across all components