Half-precision floating-point arithmetic on Intel chips

Related: https://scicomp.stackexchange.com/questions/35187/is-half-precision-supported-by-modern-architecture – has some info about BFloat16 in Cooper Lake and Sapphire Rapids, and some non-Intel info. Sapphire Rapids will have both BF16 and FP16, with FP16 using the same IEEE 754 binary16 format as the F16C conversion instructions, not brain-float. And AVX512-FP16 has support for most math operations, unlike BF16, which just has conversion to/from …
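As a concrete illustration of the F16C side of this, here is a minimal sketch (assuming an x86-64 compiler with F16C enabled, e.g. gcc/clang with -mf16c) that round-trips a float32 through IEEE 754 binary16 using the conversion intrinsics. AVX512-FP16 goes further and adds actual arithmetic on half-precision vectors (e.g. _mm512_add_ph); F16C itself only converts.

```cpp
// Sketch: binary16 round trip via F16C conversion intrinsics.
// Build with e.g.: g++ -O2 -mf16c f16c_demo.cpp
#include <immintrin.h>
#include <cstdio>

int main() {
    __m128 f32 = _mm_set1_ps(3.14159265f);
    // Convert 4 floats to 4 binary16 values (round to nearest even),
    // packed into the low 64 bits of an integer vector.
    __m128i f16 = _mm_cvtps_ph(f32, _MM_FROUND_TO_NEAREST_INT);
    // Convert back to float32 to observe the precision loss.
    __m128 back = _mm_cvtph_ps(f16);
    printf("%.8f -> %.8f after a binary16 round trip\n",
           _mm_cvtss_f32(f32), _mm_cvtss_f32(back));
    return 0;
}
```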

Why is there no 2-byte float and does an implementation already exist?

TL;DR: 16-bit floats do exist, and there are various software as well as hardware implementations. There are currently two common standard 16-bit float formats: IEEE-754 binary16 and Google’s bfloat16. Since they’re standardized, obviously anyone who knows the spec can write an implementation. Some examples:

- https://github.com/ramenhut/half
- https://github.com/minhhn2910/cuda-half2
- https://github.com/tianshilei1992/half_precision
- https://github.com/acgessler/half_float

Or if you don’t want to use …
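For the bfloat16 side, a toy software conversion is easy because bfloat16 is just the top 16 bits of an IEEE-754 float32 (same 8-bit exponent, truncated mantissa). The sketch below uses round-to-nearest-even truncation; the helper names to_bf16/from_bf16 are illustrative and not taken from any of the libraries linked above, and NaN handling is omitted.

```cpp
// Sketch: software bfloat16 <-> float32 conversion (no special hardware needed).
#include <cstdint>
#include <cstring>
#include <cstdio>

static uint16_t to_bf16(float f) {
    uint32_t bits;
    std::memcpy(&bits, &f, sizeof bits);                 // type-pun via memcpy
    uint32_t rounding = 0x7FFFu + ((bits >> 16) & 1u);   // round to nearest even
    return static_cast<uint16_t>((bits + rounding) >> 16);  // keep top 16 bits
}

static float from_bf16(uint16_t h) {
    uint32_t bits = static_cast<uint32_t>(h) << 16;      // widen back to float32
    float f;
    std::memcpy(&f, &bits, sizeof f);
    return f;
}

int main() {
    float x = 3.14159265f;
    printf("%.8f -> %.8f after a bfloat16 round trip\n", x, from_bf16(to_bf16(x)));
    return 0;
}
```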