Is while(1); undefined behavior in C?

It is well defined behavior. In C11 a new clause 6.8.5 ad 6 has been added

An iteration statement whose controlling expression is not a constant expression,¹⁵⁶⁾ that performs no input/output operations, does not access volatile objects, and performs no synchronization or atomic operations in its body, controlling expression, or (in the case of a for statement) its expression-3, may be assumed by the implementation to terminate.¹⁵⁷⁾

¹⁵⁷⁾_{This is intended to allow compiler transformations such as removal of empty loops even when termination cannot be proven.}

Since the controlling expression of your loop is a constant, the compiler may not assume the loop terminates. This is intended for reactive programs that should run forever, like an operating system.

However for the following loop the behavior is unclear

a = 1; while(a);

In effect a compiler may or may not remove this loop, resulting in a program that may terminate or may not terminate. That is not really undefined, as it is not allowed to erase your hard disk, but it is a construction to avoid.

There is however another snag, consider the following code:

a = 1; while(a) while(1);

Now since the compiler may assume the outer loop terminates, the inner loop should also terminate, how else could the outer loop terminate. So if you have a really smart compiler then a while(1); loop that should not terminate has to have such non-terminating loops around it all the way up to main. If you really want the infinite loop, you’d better read or write some volatile variable in it.

Why this clause is not practical

It is very unlikely our compiler company is ever going to make use of this clause, mainly because it is a very syntactical property. In the intermediate representation (IR), the difference between the constant and the variable in the above examples is easily lost through constant propagation.

The intention of the clause is to allow compiler writers to apply desirable transformations like the following. Consider a not so uncommon loop:

int f(unsigned int n, int *a)
{       unsigned int i;
        int s;
        
        s = 0;
        for (i = 10U; i <= n; i++)
        {
                s += a[i];
        }
        return s;
}

For architectural reasons (for example hardware loops) we would like to transform this code to:

int f(unsigned int n, int *a)
{       unsigned int i;
        int s;
        
        s = 0;
        for (i = 0; i < n-9; i++)
        {
                s += a[i+10];
        }
        return s;
}

Without clause 6.8.5 ad 6 this is not possible, because if n equals UINT_MAX, the loop may not terminate. Nevertheless it is pretty clear to a human that this is not the intention of the writer of this code. Clause 6.8.5 ad 6 now allows this transformation. However the way this is achieved is not very practical for a compiler writer as the syntactical requirement of an infinite loop is hard to maintain on the IR.

Note that it is essential that n and i are unsigned as overflow on signed int gives undefined behavior and thus the transformation can be justified for this reason. Efficient code however benefits from using unsigned, apart from the bigger positive range.

An alternative approach

Our approach would be that the code writer has to express his intention by for example inserting an assert(n < UINT_MAX) before the loop or some Frama-C like guarantee. This way the compiler can “prove” termination and doesn’t have to rely on clause 6.8.5 ad 6.

P.S: I’m looking at a draft of April 12, 2011 as paxdiablo is clearly looking at a different version, maybe his version is newer. In his quote the element of constant expression is not mentioned.

More Related Contents:

Leave a Comment Cancel reply