Tutel v0.1.5
What's New in v0.1.5:
- Add a 2D hierarchical all-to-all (a2a) algorithm for extremely large-scale runs;
- Support different parallel_type for MoE computation: data, model, auto;
- Combine different expert granularities (e.g. normal experts, sharded experts, Megatron dense FFN) into the same programming interface and style;
- New feature: is_postscore, which specifies whether gating scores are weighted during encoding or decoding;
- Enhance existing features: JIT compiler, a2a overlap with the 2D hierarchical a2a.
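
The is_postscore item above can be illustrated with a minimal pure-Python sketch. It is not Tutel's implementation: the toy `expert` function, the `moe_top1` helper, and the top-1 routing are hypothetical, and it assumes that is_postscore=True means the gating score is applied to the expert output (decoding) while False means it is applied to the expert input (encoding). With a non-linear expert the two choices generally give different results:

```python
def expert(x):
    # Toy expert (hypothetical, not Tutel's FFN): affine map followed by ReLU.
    return [max(0.0, 2.0 * v - 1.0) for v in x]


def moe_top1(token, score, is_postscore):
    # Assumption: True weights at decode time, False weights at encode time.
    if is_postscore:
        # Run the expert on the raw token, then scale its output by the gate score.
        return [score * v for v in expert(token)]
    # Scale the token by the gate score first, then run the expert.
    return expert([score * v for v in token])


token = [1.0, 2.0]
score = 0.5

post = moe_top1(token, score, is_postscore=True)
pre = moe_top1(token, score, is_postscore=False)

print("decode-time weighting:", post)  # [0.5, 1.5]
print("encode-time weighting:", pre)   # [0.0, 1.0]
```

Because the toy expert has a bias term and a ReLU, scaling before or after it does not commute, which is why the choice may matter for real experts as well.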
How to Set Up:
python3 -m pip install --user https://github.com/microsoft/tutel/archive/refs/tags/v0.1.5.tar.gz
Contributors: @abuccts, @yzygitzh, @ghostplant, @EricWangCN