forked from aws/aws-ofi-nccl
Commit
tuner: replacing model with regions-based tuner
This commit replaces the model-based tuner with a regions-based one. The tuner now uses a set of regions to choose which algorithm and protocol combination to use. A region is an ordered list of vertices defining a polygon in the 2D space (message size × number of ranks). A region covers all the points where the corresponding algorithm+protocol combination should be chosen, and every combination that can be chosen has its own region.

When NCCL invokes get_coll_info(), the tuner scans the regions in order and selects the first one that contains the (message size, number of ranks) point, either inside it or on its edge. Regions can therefore overlap, and their order matters. We use the ray-tracing algorithm to determine whether a point belongs to a region. Moreover, we extend regions to larger message sizes and higher numbers of ranks by extending their external-facing sides. Any point that does not belong to any region falls back to NCCL's internal tuner. We also have different sets of regions for different ratios of (num_ranks/num_nodes).

Signed-off-by: Amedeo Sapio <[email protected]>
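The lookup described in the commit message can be sketched roughly as follows. This is a minimal illustration, not the code from this commit: the type and function names (`point_t`, `region_t`, `point_in_region`, `find_region`) are hypothetical, the point-in-polygon test is the common even-odd ray-casting variant, and the sketch leaves out both the extension of outer edges to larger sizes and the separate region sets per (num_ranks/num_nodes) ratio.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical types for illustration; not the plugin's actual definitions. */
typedef struct {
    double x; /* message size */
    double y; /* number of ranks */
} point_t;

typedef struct {
    int algorithm;           /* placeholder id for the chosen algorithm */
    int protocol;            /* placeholder id for the chosen protocol */
    size_t num_vertices;
    const point_t *vertices; /* ordered polygon vertices */
} region_t;

/* Returns 1 if p lies on the closed segment a-b. The exact comparison with
 * zero assumes coordinates that are exactly representable (e.g. integers). */
static int on_segment(point_t p, point_t a, point_t b)
{
    double cross = (b.x - a.x) * (p.y - a.y) - (b.y - a.y) * (p.x - a.x);
    if (cross != 0.0)
        return 0;
    double min_x = a.x < b.x ? a.x : b.x, max_x = a.x < b.x ? b.x : a.x;
    double min_y = a.y < b.y ? a.y : b.y, max_y = a.y < b.y ? b.y : a.y;
    return p.x >= min_x && p.x <= max_x && p.y >= min_y && p.y <= max_y;
}

/* Even-odd ray casting: shoot a horizontal ray from p towards +x and count
 * how many polygon edges it crosses. Points on an edge count as inside. */
static int point_in_region(point_t p, const region_t *r)
{
    assert(r->num_vertices >= 3);
    int inside = 0;
    for (size_t i = 0, j = r->num_vertices - 1; i < r->num_vertices; j = i++) {
        point_t a = r->vertices[i], b = r->vertices[j];
        if (on_segment(p, a, b))
            return 1;
        /* Only edges that straddle the horizontal line through p can be hit;
         * this also guarantees b.y != a.y in the division below. */
        if ((a.y > p.y) != (b.y > p.y)) {
            double x_cross = a.x + (p.y - a.y) * (b.x - a.x) / (b.y - a.y);
            if (p.x < x_cross)
                inside = !inside;
        }
    }
    return inside;
}

/* Returns the index of the first region containing p, or -1 if none does
 * (the case where the caller would fall back to NCCL's internal tuner). */
static int find_region(point_t p, const region_t *regions, size_t n)
{
    for (size_t i = 0; i < n; i++)
        if (point_in_region(p, &regions[i]))
            return (int)i;
    return -1;
}
```

Because `find_region` returns on the first hit, a point lying in two overlapping regions resolves to whichever region appears earlier in the array, which is why the ordering of regions matters.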
1 parent 2c7eff0, commit bacae0c
Showing 11 changed files with 721 additions and 845 deletions.