How are barriers/fences and acquire/release semantics implemented microarchitecturally?
Much of this has been covered in other Q&As (especially the later C++ question "How is release-and-acquire achieved on x86 only using MOV?"), but I’ll give a summary here. Still, it’s a good question, and it’s useful to collect this all in one place. On x86, every asm load is an acquire-load. To implement that efficiently, modern x86 HW …
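To make the source-level side of this concrete, here is a minimal sketch of release/acquire message passing in C++. The names `payload`, `ready`, and `run_message_passing` are made up for illustration and are not from the linked Q&A. On x86-64, mainstream compilers typically compile both the release-store and the acquire-load to a plain `mov`, because the hardware memory model already provides that ordering; check your own compiler's asm output rather than taking this on faith.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical names, chosen only for this sketch.
int payload = 0;                    // ordinary non-atomic data
std::atomic<bool> ready{false};     // flag used to publish the data

int run_message_passing() {
    // Reset state; thread creation below orders these writes before the threads run.
    payload = 0;
    ready.store(false, std::memory_order_relaxed);
    int seen = -1;

    std::thread producer([] {
        payload = 42;                                  // plain store
        ready.store(true, std::memory_order_release);  // release-store publishes payload
    });
    std::thread consumer([&seen] {
        // Spin until the acquire-load observes the release-store.
        while (!ready.load(std::memory_order_acquire)) {
        }
        seen = payload;       // ordered after the acquire-load, so it must see 42
        assert(seen == 42);
    });

    producer.join();
    consumer.join();
    return seen;
}
```

Calling `run_message_passing()` from `main` returns the value the consumer read. The point of the sketch is that no explicit fence instruction is requested anywhere: the ordering is expressed only through the `memory_order` arguments, and it is up to the compiler and the hardware to provide it.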