Arm Neon Intrinsics vs hand assembly
In my experience, the intrinsics haven’t really been worth the trouble. It’s too easy for the compiler to inject extra register spills (a store followed by a reload) between your intrinsics, and getting it to stop is more work than just writing the routine in raw NEON assembly. I’ve seen this kind of stuff in pretty …
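To make the complaint concrete, here is a minimal sketch of the kind of intrinsics code whose generated assembly is worth checking. The kernel and its name `madd_f32` are hypothetical, made up for illustration and not taken from any real project. The scalar fallback is only there so the file builds on non-Arm hosts. Whether a given compiler adds spills around code like this depends on its version and flags, so treat it as something to inspect, not as a guaranteed reproduction.

```c
#include <assert.h>
#include <stddef.h>

#if defined(__ARM_NEON) || defined(__ARM_NEON__)
#include <arm_neon.h>
#define HAVE_NEON 1
#else
#define HAVE_NEON 0
#endif

/* Hypothetical example kernel: dst[i] = a[i] * b[i] + c[i]. */
static void madd_f32(float *dst, const float *a, const float *b,
                     const float *c, size_t n)
{
    size_t i = 0;

    /* Null pointers are only acceptable when there is nothing to process. */
    assert(n == 0 || (dst && a && b && c));

#if HAVE_NEON
    /* Process four floats per iteration with 128-bit NEON registers. */
    for (; i + 4 <= n; i += 4) {
        float32x4_t va = vld1q_f32(a + i);
        float32x4_t vb = vld1q_f32(b + i);
        float32x4_t vc = vld1q_f32(c + i);
        /* vmlaq_f32(acc, x, y) computes acc + x * y in each lane. */
        vst1q_f32(dst + i, vmlaq_f32(vc, va, vb));
    }
#endif

    /* Scalar tail; also the whole loop when NEON is unavailable. */
    for (; i < n; ++i)
        dst[i] = a[i] * b[i] + c[i];
}
```

Compiling this with something like `gcc -O2 -S` (or the clang equivalent) and reading the loop body shows whether the compiler kept everything in vector registers or added stores and reloads that the source never asked for. That is the comparison to make before deciding whether hand-written assembly is worth it.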