Half precision floating point
KAWASHIMA Takahiro edited this page Dec 4, 2018 · 13 revisions
- Format
  - IEEE
  - ISO/IEC/IEEE JTC 1/SC 25
- C
  - ISO/IEC JTC 1/SC 22/WG 14 (C WG)
- C++
  - ISO/IEC JTC 1/SC 22/WG 21 (C++ WG)
- Fortran
  - ISO/IEC JTC 1/SC 22/WG 5 (Fortran WG)
- MPI
  - MPI Forum
- IEEE 754-2008 - IEEE Standard for Floating-Point Arithmetic (2008-08-29)
- http://ieeexplore.ieee.org/servlet/opac?punumber=4610933
- `binary16` is defined as a format
- ISO/IEC/IEEE 60559:2011
- http://www.iso.org/iso/iso_catalogue/catalogue_tc/catalogue_detail.htm?csnumber=57469
- Same as IEEE 754-2008
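
As a rough illustration of the `binary16` format (1 sign bit, 5 exponent bits with bias 15, 10 fraction bits), the following sketch decodes a raw 16-bit pattern into a `float` without relying on any compiler half-precision type. The function name `binary16_to_float` is hypothetical and exists only for this example.

```c
#include <math.h>    /* INFINITY and NAN macros only */
#include <stdint.h>

/* Decode an IEEE 754-2008 binary16 bit pattern into a float.
 * binary16_to_float is a hypothetical helper name for this sketch. */
static float binary16_to_float(uint16_t bits)
{
    int sign     = (bits >> 15) & 0x1;   /* 1 sign bit */
    int exponent = (bits >> 10) & 0x1F;  /* 5 exponent bits, bias 15 */
    int fraction = bits & 0x3FF;         /* 10 fraction bits */
    float value;

    if (exponent == 0x1F) {
        /* All-ones exponent: infinity if the fraction is zero, else NaN. */
        value = (fraction == 0) ? INFINITY : NAN;
    } else {
        /* Normal numbers carry an implicit leading 1 (worth 1024 here);
         * zero and subnormals use exponent field 0 with no implicit bit
         * and the same scale as exponent field 1. */
        int significand = (exponent == 0) ? fraction : fraction + 1024;
        int e = ((exponent == 0) ? 1 : exponent) - 25;  /* 15 bias + 10 fraction bits */
        float scale = 1.0f;

        /* Build 2^e by repeated exact multiplication (e is in [-24, 5]). */
        while (e > 0) { scale *= 2.0f; e--; }
        while (e < 0) { scale *= 0.5f; e++; }
        value = (float)significand * scale;
    }
    return sign ? -value : value;
}
```

For instance, `binary16_to_float(0x3C00)` yields `1.0f`, and `binary16_to_float(0x7BFF)` yields `65504.0f`, the largest finite `binary16` value.
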
- ISO/IEC JTC 1/SC 22/WG 14 N1945 (ISO/IEC TS 18661-3:2015): Information technology - Programming languages, their environments, and system software interfaces - Floating-point extensions for C
- http://www.iso.org/iso/home/store/catalogue_tc/catalogue_detail.htm?csnumber=65615 (2015-10)
- http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1945.pdf (2015-06-10)
- ISO/IEC TS (technical specification) which adds IEEE 754-2008 support to C
- `_Float16` and `_Float16 _Complex` are defined as types for IEEE 754-2008 `binary16`
- Extension of ISO/IEC 9899:2011
- Not included in ISO/IEC 9899:2018 (C17, also known as C18)
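
A minimal sketch of using `_Float16` from C follows. It assumes the compiler predefines `__FLT16_MANT_DIG__` when the type is available, which is a compiler convention (as in GCC) rather than something the TS requires. The names `half_t`, `HAVE_FLOAT16`, and `sum_half` are hypothetical and introduced only for this example.

```c
#include <stddef.h>

/* Assumption: the compiler predefines __FLT16_MANT_DIG__ on targets where
 * _Float16 exists. On other compilers the sketch falls back to float. */
#if defined(__FLT16_MANT_DIG__)
typedef _Float16 half_t;   /* hypothetical alias for this sketch */
#define HAVE_FLOAT16 1
#else
typedef float half_t;      /* fallback so the sketch still compiles */
#define HAVE_FLOAT16 0
#endif

/* Sum n half-precision values, accumulating in float so that the
 * running total is not rounded to half precision at every step. */
static float sum_half(const half_t *values, size_t n)
{
    float acc = 0.0f;
    for (size_t i = 0; i < n; i++) {
        acc += (float)values[i];
    }
    return acc;
}
```

Accumulating in `float` is a design choice of this sketch: `binary16` has only 11 significant bits, so summing directly in `half_t` would lose precision quickly.
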
- ISO/IEC JTC 1/SC 22/WG 14 N2016: Adding Fundamental Type for Short Float
- http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2016.pdf (2016-02-14)
- Same as ISO/IEC JTC 1/SC 22/WG 21 P0192R1 of C++
- C++ Standards Committee Papers
- ISO/IEC JTC 1/SC 22/WG 21 P0192R0: Adding Fundamental Type for Short Float
- http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2016/p0192r0.pdf (2015-11-11)
- `short float` is proposed for C and C++ as a new floating-point type shorter than 32 bits
- Complex types are not described
- ISO/IEC JTC 1/SC 22/WG 21 P0192R1: Adding Fundamental Type for Short Float
- http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2016/p0192r1.pdf (2016-02-14)
- Improved version of P0192R0
- Same as ISO/IEC JTC 1/SC 22/WG 14 N2016 of C
- ISO/IEC JTC 1/SC 22/WG 21 P0303R0: Extensions to C++ for Short Float Type
- http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2017/p0303r0.pdf (2017-10-15)
- Based on P0192R1
- Shows the proposed changes against the then-current C++ working draft
- `std::complex<short float>` is added as a C++ complex type
- ISO/IEC JTC 1/SC 22/WG 21 P0192R4: `short float` and fixed-size floating point types
- http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2018/p0192r4.html (2018-10-08)
- Improved version of P0192R1
- `short float`
  - Same as or shorter than `float`
  - Bit length is not specified
  - Same as or shorter than `std::complex<short float>`
- `std::float16_t`, `std::float32_t`, `std::float64_t`
  - IEEE 754-2008-compliant types
  - Bit length is specified
- GitHub issues
- 16-bit floating-point support for C/C++
- define language-agnostic, IEEE types
- Data type naming rule
- Slides
- pt2pt wg FP16 MPI Forum Meeting Dec 4 - Dec 7, 2017, San Jose
- MPI Forum virtual meeting FP16 Jan 31, 2018
- AArch64 v8.2 FP16 extensions are supported for ARM starting from QEMU 2.12
- `_Float16` and `__fp16` are supported in C and C++ for ARM starting from GCC 7
- https://gcc.gnu.org/onlinedocs/gcc/Floating-Types.html (latest version)
- https://gcc.gnu.org/onlinedocs/gcc/Half-Precision.html (latest version)
- https://gcc.gnu.org/onlinedocs/gcc-7.1.0/gcc/Floating-Types.html (GCC 7.1.0)
- https://gcc.gnu.org/onlinedocs/gcc-7.1.0/gcc/Half-Precision.html (GCC 7.1.0)
- `_Float16` and `__fp16` are supported in C and C++ for all(?) platforms starting from Clang 6
- GitHub issue on `MPI_REAL4` and `MPI_COMPLEX4`
- Prototype implementation of `MPI_SHORT_FLOAT`, `MPI_C_SHORT_FLOAT_COMPLEX`, and `MPI_CXX_SHORT_FLOAT_COMPLEX`
- GitHub issue and PR for `MPIX_C_FLOAT16`