Problem Statement
CPUs and GPUs have added support for 16-bit and 8-bit floating-point datatypes. SHMEM should support them.
Proposed Changes
This proposal adds support for important new floating-point datatypes. BF16 and FP16 are the primary candidates; FP8 is a less likely possibility.
Support is straightforward for the data-movement APIs: Put, Get, Collect, Broadcast, and Alltoall. A sketch follows below.
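For data movement, BF16 or FP16 payloads can already be transferred with the untyped routines such as shmem_putmem. The typed call shown in the comment is a hypothetical addition named by analogy with the existing shmem_TYPE_put routines; the shmem_bf16_put name and the raw 16-bit storage type are assumptions, not part of the current specification.

```c
/* Minimal sketch: moving BF16 data between PEs with today's untyped API,
 * plus a hypothetical typed variant this proposal might add. */
#include <shmem.h>
#include <stdint.h>

#define N 1024

static uint16_t src[N];   /* BF16 values stored as raw 16-bit words */
static uint16_t dest[N];  /* symmetric destination buffer */

int main(void) {
    shmem_init();
    int npes = shmem_n_pes();
    int me   = shmem_my_pe();

    for (int i = 0; i < N; i++)
        src[i] = 0x3F80;  /* BF16 bit pattern for 1.0 */

    /* Untyped transfer, available today: length is given in bytes. */
    shmem_putmem(dest, src, N * sizeof(uint16_t), (me + 1) % npes);

    /* Hypothetical typed transfer the proposal could add:
     * shmem_bf16_put(dest, src, N, (me + 1) % npes); */

    shmem_barrier_all();
    shmem_finalize();
    return 0;
}
```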
Support for computational atomics is straightforward where hardware support exists (see the interface sketch below).
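A minimal sketch of what computational atomics on 16-bit floats could look like, assuming the existing shmem_TYPE_atomic_add / shmem_TYPE_atomic_fetch_add naming pattern is extended to the new types; every name and storage type below is hypothetical.

```c
/* Hypothetical prototypes; none of these names exist in the current
 * specification. The 16-bit storage types are assumptions. */
#include <stddef.h>

typedef unsigned short shmem_fp16_t;  /* assumed IEEE binary16 storage */
typedef unsigned short shmem_bf16_t;  /* assumed bfloat16 storage      */

/* Non-fetching atomic add on a symmetric FP16/BF16 object at PE pe. */
void shmem_fp16_atomic_add(shmem_fp16_t *dest, shmem_fp16_t value, int pe);
void shmem_bf16_atomic_add(shmem_bf16_t *dest, shmem_bf16_t value, int pe);

/* Fetching variants that return the prior value. */
shmem_fp16_t shmem_fp16_atomic_fetch_add(shmem_fp16_t *dest,
                                         shmem_fp16_t value, int pe);
shmem_bf16_t shmem_bf16_atomic_fetch_add(shmem_bf16_t *dest,
                                         shmem_bf16_t value, int pe);
```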
Support for reductions is a good question for discussion. It might be useful to add a new kind of reduction in which the target buffer has a different datatype than the source; for example, a sum reduction from BF16 to FP32 would retain more precision. A possible interface is sketched below.
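A hypothetical interface sketch for such a mixed-precision reduction, modeled on the OpenSHMEM 1.5 team-based shmem_TYPENAME_sum_reduce routines; the routine name and the shmem_bf16_t storage type are assumptions made only for illustration.

```c
/* Hypothetical mixed-precision sum reduction: BF16 source data is
 * accumulated and delivered in FP32 so that precision is retained. */
#include <shmem.h>
#include <stddef.h>

typedef unsigned short shmem_bf16_t;  /* assumed bfloat16 storage type */

int shmem_bf16_float_sum_reduce(shmem_team_t team,
                                float *dest,                /* FP32 result */
                                const shmem_bf16_t *source, /* BF16 input  */
                                size_t nreduce);

/* Possible usage on all PEs of SHMEM_TEAM_WORLD:
 *   shmem_bf16_float_sum_reduce(SHMEM_TEAM_WORLD, acc_f32, grads_bf16, n);
 */
```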
Impact on Implementations
Impact on Users
Adds new capabilities; there is no impact on existing applications.
References and Pull Requests
The Great 8 Bit Debate of Artificial Intelligence
BF16 vs FP16