Half-precision floating-point arithmetic on Intel chips

related: https://scicomp.stackexchange.com/questions/35187/is-half-precision-supported-by-modern-architecture – has some info about BFloat16 in Cooper Lake and Sapphire Rapids, and some non-Intel info. Sapphire Rapids will have both BF16 and FP16, with FP16 using the same IEEE754 binary16 format as F16C conversion instructions, not brain-float. And AVX512-FP16 has support for most math operations, unlike BF16 which just has conversion to/from … Read more
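
To make the format difference concrete, here is a minimal Python sketch (not taken from the linked answer) that encodes a value both as IEEE 754 binary16 (1 sign, 5 exponent, 10 fraction bits) and as bfloat16 (1 sign, 8 exponent, 7 fraction bits). The helper names are hypothetical. The bfloat16 conversion simply truncates the low 16 bits of a float32, which is a simplifying assumption; real hardware conversions typically round.

```python
import struct

def float_to_fp16_bits(x: float) -> int:
    """Encode x as IEEE 754 binary16: 1 sign, 5 exponent, 10 fraction bits."""
    # struct's "e" format is binary16 and rounds to nearest-even.
    return struct.unpack("<H", struct.pack("<e", x))[0]

def fp16_bits_to_float(bits: int) -> float:
    """Decode a binary16 bit pattern back to a Python float."""
    return struct.unpack("<e", struct.pack("<H", bits))[0]

def float_to_bf16_bits(x: float) -> int:
    """Encode x as bfloat16: 1 sign, 8 exponent, 7 fraction bits.

    Assumption: truncate the float32 encoding (keep the top 16 bits)
    rather than rounding, to keep the sketch short.
    """
    f32_bits = struct.unpack("<I", struct.pack("<f", x))[0]
    return f32_bits >> 16

def bf16_bits_to_float(bits: int) -> float:
    """Decode a bfloat16 bit pattern by padding it back out to float32."""
    return struct.unpack("<f", struct.pack("<I", bits << 16))[0]

if __name__ == "__main__":
    for value in (1.0, 1.001, 65504.0):
        fp16 = float_to_fp16_bits(value)
        bf16 = float_to_bf16_bits(value)
        print(f"{value!r}: fp16=0x{fp16:04X} ({fp16_bits_to_float(fp16)!r}), "
              f"bf16=0x{bf16:04X} ({bf16_bits_to_float(bf16)!r})")
```

binary16 keeps more fraction bits, so it resolves 1.001 more finely than bfloat16 does, while bfloat16 keeps float32's exponent range and can represent magnitudes far beyond binary16's maximum of 65504.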

Do 128bit cross lane operations in AVX512 give better performance?

Generally yes, in-lane shuffles still have lower latency on SKX (1 cycle vs. 3), but it’s usually not worth spending extra instructions to use them instead of the powerful lane-crossing shuffles. However, vpermt2w and a couple of other shuffles need multiple shuffle-port uops, so they cost as much as multiple simpler shuffles. Shuffle throughput very easily becomes … Read more
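
As a rough illustration of what "in-lane" versus "lane-crossing" means, the following Python sketch models the data movement of one shuffle of each kind on a 512-bit vector. It models semantics only and says nothing about latency or port usage. The function names are hypothetical stand-ins for the vpshufb (in-lane byte shuffle) and vpermw (lane-crossing word permute) instructions, with vectors represented as plain lists.

```python
def vpshufb_512(src, ctrl):
    """Model of an in-lane byte shuffle on 64 bytes (four 16-byte lanes).

    Each output byte can only come from the same 128-bit lane it sits in:
    the low 4 bits of the control byte pick a byte within that lane, and a
    set high bit zeroes the output byte.
    """
    assert len(src) == len(ctrl) == 64
    out = []
    for i, c in enumerate(ctrl):
        if c & 0x80:
            out.append(0)
        else:
            lane_base = i & ~0xF          # first byte index of this lane
            out.append(src[lane_base | (c & 0xF)])
    return out

def vpermw_512(src, idx):
    """Model of a lane-crossing word permute on 32 16-bit elements.

    Any source element can land in any output position; only the low
    5 bits of each index are used.
    """
    assert len(src) == len(idx) == 32
    return [src[j & 31] for j in idx]

if __name__ == "__main__":
    data = list(range(64))
    # Broadcasting "byte 0" in-lane yields each lane's own first byte.
    print(vpshufb_512(data, [0] * 64)[::16])
    # A full reversal needs a lane-crossing shuffle.
    print(vpermw_512(list(range(32)), list(range(31, -1, -1)))[:4])
```

The in-lane model can never move data between 128-bit lanes, which is why operations such as a full-vector reversal need either a lane-crossing shuffle or extra instructions.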

Does Skylake need vzeroupper for turbo clocks to recover after a 512-bit instruction that only reads a ZMM register, writing a k mask?

No, a vpcmpeqb into a mask register does not trigger slow mode if you use a zmm register as one of the comparands, at least on SKX. This is also true of any other instruction (as far as I tested) which only reads the key 512-bit registers (the key registers being zmm0 – … Read more

What EXACTLY is the difference between intel’s and amd’s ISA, if any?

Yes, the ISA is a document / specification, not hardware. Implementing all of it correctly is what makes something an x86 CPU, rather than just something with similarities to x86. See the x86 tag wiki for links to the official docs (Intel’s manuals). Intel and AMD’s implementations of the x86 ISA differ mainly in performance, … Read more

Android emulator system images and AMD processor

According to the Android documentation for the emulator, the x86 image specifically requires an Intel processor. When they say: …Virtual machine acceleration for Windows requires the installation of the Intel Hardware Accelerated Execution Manager (Intel HAXM). The software requires an Intel CPU with Virtualization Technology (VT) support… They are referring not just to supporting “Virtualization”, … Read more