compiler-optimization - w3toppers.com

When can Hotspot allocate objects on the stack? [duplicate]

I have done some experimentation in order to see when Hotspot is able to stack allocate. It turns out that its stack allocation is quite a bit more limited than what you might expect based on the available documentation. The referenced paper by Choi “Escape Analysis for Java” suggests that an object that is only … Read more

How do I make an infinite empty loop that won’t be optimized away?

The C11 standard says this in 6.8.5/6: An iteration statement whose controlling expression is not a constant expression, that performs no input/output operations, does not access volatile objects, and performs no synchronization or atomic operations in its body, controlling expression, or (in the case of a for statement) its expression-3, may be assumed by the implementation … Read more
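
The quoted rule lists what keeps a loop from being assumed to terminate, and a volatile access is one of those things. A minimal sketch of that idea follows, written in C++ on the assumption that its forward-progress rules treat volatile accesses the same way. The names `hang_forever`, `spin`, `heartbeat`, and `done` are hypothetical and do not come from the original answer.

```cpp
#include <cassert>

// Hypothetical helper: loops forever. Each iteration performs a volatile
// store, which counts as an observable side effect, so the compiler is not
// entitled to assume the loop terminates and drop it.
[[noreturn]] void hang_forever() {
    volatile unsigned char heartbeat = 0;
    for (;;) {
        heartbeat = 1;  // volatile write on every iteration
    }
}

// Bounded variant of the same idea, useful for checking the behaviour:
// the counter is volatile, so the otherwise empty loop has to execute
// every iteration instead of being folded into a single assignment.
unsigned spin(unsigned iterations) {
    volatile unsigned done = 0;
    while (done < iterations) {
        done = done + 1;  // volatile read followed by a volatile write
    }
    return done;
}
```

Calling `spin(1000)` returns 1000 after actually executing the loop. `hang_forever()` never returns, so it is only suitable as a final "park here" call.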

GCC -Wuninitialized / -Wmaybe-uninitialized issues

Indeed, this is a known problem in gcc. gcc is notorious for incorrectly reporting variables as uninitialized. The shortcomings have been duly noted and there is an initiative to overcome them: Better Uninitialized Warnings: The GNU Compiler Collection warns about the use of uninitialized variables with the option -Wuninitialized. However, the current implementation has some … Read more

When do programmers use Empty Base Optimization (EBO)

EBO is important in the context of policy-based design, where you generally inherit privately from multiple policy classes. If we take the example of a thread safety policy, one could imagine the pseudo-code: class MTSafePolicy { public: void lock() { mutex_.lock(); } void unlock() { mutex_.unlock(); } private: Mutex mutex_; }; class MTUnsafePolicy … Read more
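
The excerpt's pseudo-code is cut off, so here is a hedged sketch of how such a policy pair might be used. `MTSafePolicy` and `MTUnsafePolicy` follow the names in the excerpt, with `std::mutex` standing in for its `Mutex`. The `Counter` and `CounterWithMember` hosts are hypothetical and exist only to show the size difference between inheriting from an empty policy and storing it as a member.

```cpp
#include <cassert>
#include <mutex>

// Policy that really locks; std::mutex stands in for the excerpt's Mutex.
class MTSafePolicy {
public:
    void lock() { mutex_.lock(); }
    void unlock() { mutex_.unlock(); }
private:
    std::mutex mutex_;
};

// Empty policy: no data members, lock/unlock do nothing.
class MTUnsafePolicy {
public:
    void lock() {}
    void unlock() {}
};

// Hypothetical host that inherits privately from its policy, so an empty
// policy can occupy zero bytes thanks to the empty base optimization.
template <class ThreadingPolicy>
class Counter : private ThreadingPolicy {
public:
    int increment() {
        this->lock();
        int result = ++value_;
        this->unlock();
        return result;
    }
private:
    int value_ = 0;
};

// Same host, but the policy is a data member: even an empty policy needs
// its own address here, so the object grows.
template <class ThreadingPolicy>
class CounterWithMember {
public:
    int increment() {
        policy_.lock();
        int result = ++value_;
        policy_.unlock();
        return result;
    }
private:
    ThreadingPolicy policy_;
    int value_ = 0;
};

// With EBO the empty base adds nothing; as a member it costs at least a byte.
static_assert(sizeof(Counter<MTUnsafePolicy>) == sizeof(int),
              "empty base should not enlarge the host");
static_assert(sizeof(CounterWithMember<MTUnsafePolicy>) > sizeof(int),
              "an empty member still takes space");
```

Swapping `MTUnsafePolicy` for `MTSafePolicy` changes the locking behaviour without touching the code in `Counter`. The single-threaded instantiation pays no size cost for the unused policy.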

GCC: Difference between -O3 and -Os

The GCC documentation describes what these options do very explicitly. -O3 tries to optimize code very heavily for performance. It includes all of the optimizations -O2 includes, plus some more. -Os, on the other hand, instructs GCC to “optimize for size.” It enables all -O2 optimizations which do not increase the size of the executable, … Read more

No speedup when summing uint16 vs uint64 arrays with NumPy?

TL;DR: I made an experimental analysis on Numpy 1.21.1. Experimental results show that np.sum does NOT (really) make use of SIMD instructions: no SIMD instructions are used for integers, and scalar SIMD instructions are used for floating-point numbers! Moreover, Numpy converts the integers to 64-bit values for smaller integer types by default so as to avoid … Read more

What is the difference between the /Ox and /O2 compiler options?

I found it here: /Ox and /O2 are almost identical. They differ only in that /O2 also enables /GF and /Gy. There is almost no reason to avoid enabling these two switches.

How do C compilers implement functions that return large structures?

None; no copies are done. The address of the caller’s Data return value is actually passed as a hidden argument to the function, and the createData function simply writes into the caller’s stack frame. This is known as the named return value optimisation. Also see the C++ FAQ on this topic. commercial-grade C++ compilers implement … Read more
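
The hidden-argument mechanism described above can be made visible by writing the lowered form by hand. This is only a sketch: `Data` and `createData` are named in the excerpt, but the array layout and the `createDataLowered` function are assumptions made for illustration. Whether a given compiler and ABI pass the return slot exactly this way is platform-specific.

```cpp
#include <cassert>

// Assumed layout: large enough that it cannot be returned in registers.
struct Data {
    int values[1000];
};

// Source-level form: returns the structure by value.
Data createData() {
    Data result{};
    for (int i = 0; i < 1000; ++i) {
        result.values[i] = i;
    }
    return result;  // typically constructed directly in the caller's storage
}

// Hand-written equivalent of what the compiler is described as doing:
// the caller passes the address of its own Data object, and the function
// fills it in place. The parameter name is hypothetical.
void createDataLowered(Data* hidden_return_slot) {
    for (int i = 0; i < 1000; ++i) {
        hidden_return_slot->values[i] = i;
    }
}
```

Both forms leave the same contents in the caller's object. The second one spells out the pointer that the first one receives implicitly.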

How to optimize these loops (with compiler optimization disabled)?

Re-posting a modified version of my answer from optimized sum of an array of doubles in C, since that question got voted down to -5. The OP of the other question phrased it more as “what else is possible”, so I took him at his word and info-dumped about vectorizing and tuning for current CPU … Read more

Will the jit optimize new objects

Yes, the HotSpot JIT can eliminate redundant allocations in a local context. This optimization is provided by escape analysis, which has been enabled by default since JDK 6u23. It is often confused with on-stack allocation, but in fact it is much more powerful, since it allows not only to allocate objects on the stack, but to eliminate allocation altogether by replacing … Read more