How do I make an infinite empty loop that won’t be optimized away?

The C11 standard says this, 6.8.5/6: An iteration statement whose controlling expression is not a constant expression, that performs no input/output operations, does not access volatile objects, and performs no synchronization or atomic operations in its body, controlling expression, or (in the case of a for statement) its expression-3, may be assumed by the implementation … Read more
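
As a hedged illustration of the rule quoted above, written in C++ (which has a comparable forward-progress assumption): one way to keep an otherwise empty loop from being assumed to terminate is to have its controlling expression read a volatile object. The function name `spin_while_set` and its parameter are hypothetical, chosen only for this sketch.

```cpp
#include <cassert>

// Hypothetical helper: spins for as long as *flag reads as true.
// Because the flag is volatile, every iteration performs an observable
// access, so the loop does not fall under the "may be assumed to
// terminate" allowance for side-effect-free loops.
void spin_while_set(const volatile bool* flag) {
    assert(flag != nullptr);
    while (*flag) {
        // intentionally empty: the volatile read happens in the condition
    }
}
```

If the flag is never cleared (for example by an interrupt handler or a debugger), the call simply never returns, which is the intended "infinite empty loop" behaviour.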

GCC -Wuninitialized / -Wmaybe-uninitialized issues

Indeed, this is a known problem in GCC. GCC is notorious for reporting incorrect uninitialized variables. The shortcomings have been duly noted and there is an initiative to overcome them: Better Uninitialized Warnings: The GNU Compiler Collection warns about the use of uninitialized variables with the option -Wuninitialized. However, the current implementation has some … Read more
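
A minimal sketch of the code shape commonly associated with such reports: a variable that is assigned and later read under the same condition, so it is never read uninitialized, yet the compiler has to prove that across two branches. The function `pick_doubled` is hypothetical, and whether a particular GCC version actually warns here depends on the version and optimization level; that part is an assumption, not a claim about any specific release.

```cpp
// Hypothetical example: `value` is only read on paths where it was written.
int pick_doubled(bool have_value, int input) {
    int value;                      // deliberately left uninitialized
    if (have_value) {
        value = input * 2;          // written only when have_value is true
    }
    if (have_value) {
        return value;               // read only when have_value is true
    }
    return -1;                      // fallback when there is no value
}
```

Initializing `value` at its declaration is the usual way to make the question moot, at the cost of hiding genuine uninitialized-use bugs from the warning.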

When do programmers use Empty Base Optimization (EBO)

EBO is important in the context of policy-based design, where you generally inherit privately from multiple policy classes. If we take the example of a thread safety policy, one could imagine the pseudo-code: class MTSafePolicy { public: void lock() { mutex_.lock(); } void unlock() { mutex_.unlock(); } private: Mutex mutex_; }; class MTUnsafePolicy … Read more
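
To make the size effect concrete, here is a small sketch under stated assumptions: `std::mutex` stands in for the `Mutex` type of the pseudo-code, the body of `MTUnsafePolicy` (truncated above) is assumed to be a pair of no-op functions, and `Counter` and `CounterWithMember` are hypothetical hosts invented for the comparison.

```cpp
#include <mutex>

class MTSafePolicy {
public:
    void lock() { mutex_.lock(); }
    void unlock() { mutex_.unlock(); }
private:
    std::mutex mutex_;              // assumption: std::mutex replaces Mutex
};

class MTUnsafePolicy {              // assumption: empty class with no-op locking
public:
    void lock() {}
    void unlock() {}
};

// Policy as a private base: an empty policy can occupy no storage (EBO).
template <class ThreadingPolicy>
class Counter : private ThreadingPolicy {
public:
    int increment() {
        this->lock();
        int result = ++value_;
        this->unlock();
        return result;
    }
private:
    int value_ = 0;
};

// Policy as a data member: even an empty policy takes at least one byte.
template <class ThreadingPolicy>
struct CounterWithMember {
    ThreadingPolicy policy;
    int value = 0;
};

static_assert(sizeof(Counter<MTUnsafePolicy>) == sizeof(int),
              "empty base adds no size on mainstream compilers");
static_assert(sizeof(CounterWithMember<MTUnsafePolicy>) > sizeof(int),
              "empty member still occupies storage");
```

The single-threaded variant therefore pays nothing for the policy when it is a base class, while the member version grows by at least one byte plus padding.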

GCC: Difference between -O3 and -Os

The GCC documentation describes what these options do very explicitly. -O3 tries to optimize code very heavily for performance. It includes all of the optimizations -O2 includes, plus some more. -Os, on the other hand, instructs GCC to “optimize for size.” It enables all -O2 optimizations which do not increase the size of the executable, … Read more

No speedup when summing uint16 vs uint64 arrays with NumPy?

TL;DR: I made an experimental analysis on NumPy 1.21.1. Experimental results show that np.sum does NOT (really) make use of SIMD instructions: no SIMD instructions are used for integers, and scalar SIMD instructions are used for floating-point numbers! Moreover, NumPy converts the integers to 64-bit values for smaller integer types by default so as to avoid … Read more

How do C compilers implement functions that return large structures?

None; no copies are done. The address of the caller’s Data return value is actually passed as a hidden argument to the function, and the createData function simply writes into the caller’s stack frame. This is known as the named return value optimisation. Also see the C++ FAQ on this topic. commercial-grade C++ compilers implement … Read more
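
The hidden-argument idea can be sketched by hand. In the code below, `Data` and `createData` follow the names used above, but the layout of `Data` is an assumption, and `createData_lowered` is a hypothetical, hand-written approximation of what a typical ABI does with a large return value; real compilers perform this transformation internally and the details vary by platform.

```cpp
#include <cstddef>

struct Data {
    int values[1000];               // assumption: large enough to not fit in registers
};

// What the programmer writes: return the structure by value.
Data createData() {
    Data data;
    for (std::size_t i = 0; i < 1000; ++i) {
        data.values[i] = static_cast<int>(i);
    }
    return data;
}

// Roughly what the call is lowered to: the caller reserves the storage
// and passes its address, and the callee fills it in place.
void createData_lowered(Data* result) {
    for (std::size_t i = 0; i < 1000; ++i) {
        result->values[i] = static_cast<int>(i);
    }
}
```

A call such as `Data d = createData();` then behaves much like declaring `d` in the caller and invoking `createData_lowered(&d)`, with no intermediate copy of the 1000 integers.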