The bit-twiddling hacks page has a number of options.
Of course, you could argue that iterating over all 32 possible bits is O(N) in that it’s the same cost every time 🙂
For simplicity, I’d consider the lookup-table-per-byte approach, or Brian Kernighan’s neat idea which iterates as many times as there are bits set, which I’d write as:
public static int CountBits(uint value)
{
int count = 0;
while (value != 0)
{
count++;
value &= value - 1;
}
return count;
}
If you don’t like the idea of populating a 256-entry lookup table, a lookup-per-nybble would still be pretty fast. Mind you, it’s possible that 8 array lookups might be slower than 32 simple bit operations.
Of course, it’s worth testing your app’s real performance before going to particularly esoteric approaches… is this really a bottleneck for you?