Can compiler optimization introduce bugs?

Compiler optimizations can introduce bugs or undesirable behaviour. That’s why you can turn them off.

One example: a compiler can optimize the read/write access to a memory location, doing things like eliminating duplicate reads or duplicate writes, or re-ordering certain operations. If the memory location in question is only used by a single thread and is actually memory, that may be ok. But if the memory location is a hardware device IO register, then re-ordering or eliminating writes may be completely wrong. In this situation you normally have to write code knowing that the compiler might “optimize” it, and thus knowing that the naive approach doesn’t work.

Update: As Adam Robinson pointed out in a comment, the scenario I describe above is more of a programming error than an optimizer error. But the point I was trying to illustrate is that some programs, which are otherwise correct, combined with some optimizations, which otherwise work properly, can introduce bugs in the program when they are combined together. In some cases the language specification says “You must do things this way because these kinds of optimizations may occur and your program will fail”, in which case it’s a bug in the code. But sometimes a compiler has a (usually optional) optimization feature that can generate incorrect code because the compiler is trying too hard to optimize the code or can’t detect that the optimization is inappropriate. In this case the programmer must know when it is safe to turn on the optimization in question.

Another example:
The linux kernel had a bug where a potentially NULL pointer was being dereferenced before a test for that pointer being null. However, in some cases it was possible to map memory to address zero, thus allowing the dereferencing to succeed. The compiler, upon noticing that the pointer was dereferenced, assumed that it couldn’t be NULL, then removed the NULL test later and all the code in that branch. This introduced a security vulnerability into the code, as the function would proceed to use an invalid pointer containing attacker-supplied data. For cases where the pointer was legitimately null and the memory wasn’t mapped to address zero, the kernel would still OOPS as before. So prior to optimization the code contained one bug; after it contained two, and one of them allowed a local root exploit.

CERT has a presentation called “Dangerous Optimizations and the Loss of Causality” by Robert C. Seacord which lists a lot of optimizations that introduce (or expose) bugs in programs. It discusses the various kinds of optimizations that are possible, from “doing what the hardware does” to “trap all possible undefined behaviour” to “do anything that’s not disallowed”.

Some examples of code that’s perfectly fine until an aggressively-optimizing compiler gets its hands on it:

  • Checking for overflow

    // fails because the overflow test gets removed
    if (ptr + len < ptr || ptr + len > max) return EINVAL;
    
  • Using overflow artithmetic at all:

    // The compiler optimizes this to an infinite loop
    for (i = 1; i > 0; i += i) ++j;
    
  • Clearing memory of sensitive information:

    // the compiler can remove these "useless writes"
    memset(password_buffer, 0, sizeof(password_buffer));
    

The problem here is that compilers have, for decades, been less aggressive in optimization, and so generations of C programmers learn and understand things like fixed-size twos complement addition and how it overflows. Then the C language standard is amended by compiler developers, and the subtle rules change, despite the hardware not changing. The C language spec is a contract between the developers and compilers, but the terms of the agreement are subject to change over time and not everyone understands every detail, or agrees that the details are even sensible.

This is why most compilers offer flags to turn off (or turn on) optimizations. Is your program written with the understanding that integers might overflow? Then you should turn off overflow optimizations, because they can introduce bugs. Does your program strictly avoid aliasing pointers? Then you can turn on the optimizations that assume pointers are never aliased. Does your program try to clear memory to avoid leaking information? Oh, in that case you’re out of luck: you either need to turn off dead-code-removal or you need to know, ahead of time, that your compiler is going to eliminate your “dead” code, and use some work-around for it.

Leave a Comment