Bit popcount for large buffer, with Core 2 CPU (SSSE3)
See the 32-bit version in the AMD Software Optimization Guide, page 195, for one implementation; it gives you x86 assembly code directly. See also a variant on the Stanford bit-twiddling hacks page. The Stanford version looks like the best one to me, and it looks very easy to code in x86 asm. Neither of these use …
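For reference, here is a minimal C sketch of the parallel bit-count from the Stanford bit-twiddling hacks page, which is the variant I mean above. It is only an illustration of the per-word technique: the names `popcount32` and `popcount_buffer` are my own, and the plain loop over the buffer is an assumption about how you would apply it, not a tuned SSSE3 routine.

```c
#include <stddef.h>
#include <stdint.h>

/* Count the set bits in one 32-bit word with the parallel (SWAR) method
   from the Stanford bit-twiddling hacks page. */
static uint32_t popcount32(uint32_t v)
{
    v = v - ((v >> 1) & 0x55555555u);                 /* sums of adjacent bit pairs */
    v = (v & 0x33333333u) + ((v >> 2) & 0x33333333u); /* sums per 4-bit nibble */
    v = (v + (v >> 4)) & 0x0F0F0F0Fu;                 /* sums per byte */
    return (v * 0x01010101u) >> 24;                   /* add the four byte sums */
}

/* Hypothetical helper: total number of set bits in a buffer of n words. */
static uint64_t popcount_buffer(const uint32_t *words, size_t n)
{
    uint64_t total = 0;
    for (size_t i = 0; i < n; i++)
        total += popcount32(words[i]);
    return total;
}
```

Calling `popcount_buffer(buf, n)` on an array of `n` 32-bit words returns the total bit count. Each step is just shifts, masks, and adds (plus one multiply), which is why it maps so directly onto x86 asm.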