How does __builtin___clear_cache work?

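It just emits whatever machine instruction[s] the target processor needs to keep its instruction cache coherent with freshly written code (either inline, or as a call to the __clear_cache function in libgcc). On targets that don't require an explicit flush (x86 is one), it has no effect.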

Think of __builtin___clear_cache as a “portable” (to GCC and compatible compilers) way to flush the instruction cache (e.g. in some JIT library).

In practice, how could one find the right begin and end to use?

The builtin flushes the region from begin (inclusive) to end (exclusive). To be safe, I would use it on a whole page range (with the page size obtained from sysconf(_SC_PAGESIZE)), so usually a 4 KiB aligned memory range whose length is a multiple of 4 KiB. Otherwise, you need some target-specific trick to find the cache line width.
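A minimal sketch of that page-range approach, assuming GCC or a compatible compiler on a POSIX system. The helper names page_floor, page_ceil and flush_code_pages are made up for illustration, and the code assumes the page size is a power of two:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <unistd.h>

/* Hypothetical helper: round addr down to the start of its page. */
static char *page_floor(const void *addr, uintptr_t page)
{
    return (char *)((uintptr_t)addr & ~(page - 1));
}

/* Hypothetical helper: round addr up to the next page boundary
   (addr itself if it is already page aligned). */
static char *page_ceil(const void *addr, uintptr_t page)
{
    return (char *)(((uintptr_t)addr + page - 1) & ~(page - 1));
}

/* Flush the instruction cache for every page overlapping
   [code, code + len). Only pages that contain at least one byte
   of the buffer are touched, so the range stays inside mapped memory. */
static void flush_code_pages(void *code, size_t len)
{
    long page = sysconf(_SC_PAGESIZE);
    /* Assumption: the page size is a power of two. */
    assert(page > 0 && (page & (page - 1)) == 0);

    char *begin = page_floor(code, (uintptr_t)page);
    char *end = page_ceil((char *)code + len, (uintptr_t)page);

    /* begin is inclusive, end is exclusive. */
    __builtin___clear_cache(begin, end);
}
```

A JIT would call flush_code_pages(buffer, size) after writing the generated code into buffer and before jumping to it. Rounding to pages flushes more than strictly needed, which trades a little time for not having to know the cache line width.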

On Linux, you might read /proc/cpuinfo and use the cache_alignment & cache size lines (where the architecture provides them) to get a more precise cache line size and alignment.
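As a rough sketch of reading that value, the function below scans /proc/cpuinfo for the first cache_alignment line. The name cpuinfo_cache_alignment is made up for illustration, and the function returns -1 when the file or the line is missing, which can happen on architectures that do not report this field:

```c
#include <stdio.h>

/* Hypothetical helper: return the first cache_alignment value (in bytes)
   listed in /proc/cpuinfo, or -1 if the file cannot be opened or no such
   line is present. */
static long cpuinfo_cache_alignment(void)
{
    FILE *f = fopen("/proc/cpuinfo", "r");
    if (!f)
        return -1;

    char line[512];
    long value = -1;
    while (fgets(line, sizeof line, f)) {
        long v;
        /* Whitespace in the format matches the tab before the colon. */
        if (sscanf(line, "cache_alignment : %ld", &v) == 1 && v > 0) {
            value = v;
            break;
        }
    }
    fclose(f);
    return value;
}
```

Callers should treat -1 as "unknown" and fall back to something conservative, such as the page-based range.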

BTW, code using __builtin___clear_cache is very likely to be (for other reasons) target machine specific, so it has or knows some machine parameters (and that should include cache size & alignment).
