2D Array indexing – undefined behavior?

It’s undefined behavior, and here’s why.

Multidimensional array access can be broken down into a series of single-dimensional array accesses. In other words, the expression a[i][j] can be thought of as (a[i])[j]. Quoting C11 §6.5.2.1/2:

The definition of the subscript operator [] is that E1[E2] is identical to (*((E1)+(E2))).

This means the above is identical to *(*(a + i) + j). Following C11 §6.5.6/8 regarding addition of an integer and pointer (emphasis mine):
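As a minimal sketch of that equivalence, the snippet below compares the subscript form with the pointer-arithmetic form using only in-bounds indices. The 5×5 shape and the names ROWS, COLS, same_element and check_all are assumptions made for illustration; they do not come from the original question.

```c
#include <assert.h>

enum { ROWS = 5, COLS = 5 };  /* assumed dimensions, i.e. int a[5][5] */

/* Returns 1 if a[i][j] and *(*(a + i) + j) designate the same object.
   &a[i][j] and *(a + i) + j are both int * values, so they can be compared. */
int same_element(int (*a)[COLS], int i, int j)
{
    assert(i >= 0 && i < ROWS && j >= 0 && j < COLS);  /* stay in bounds */
    return &a[i][j] == *(a + i) + j;
}

/* Writes through subscripts, then reads back through pointer arithmetic. */
int check_all(void)
{
    int a[ROWS][COLS];

    for (int i = 0; i < ROWS; i++)
        for (int j = 0; j < COLS; j++)
            a[i][j] = i * COLS + j;

    for (int i = 0; i < ROWS; i++)
        for (int j = 0; j < COLS; j++)
            if (*(*(a + i) + j) != a[i][j] || !same_element(a, i, j))
                return 0;
    return 1;
}
```

Calling check_all() from main should return 1, since both spellings go through the same two additions and two dereferences.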

If both the pointer operand and the result point to elements of the same array object, or one past the last element of the array object, the evaluation shall not produce an overflow; otherwise, the behavior is undefined.

In other words, if i is not a valid index into a, the behavior is already undefined at a[i], even if a[i][j] “intuitively” seems to land inside the array.

So, in the first case, a[0] is valid, but the following [20] is not, because the type of a[0] is int[5]. Therefore, index 20 is out of bounds.

In the second case, a[-1] is already out-of-bounds, thus already UB.
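One way to stay inside defined behavior is to check each index against its own dimension before any pointer arithmetic takes place. The following is only a sketch: checked_at and grid are hypothetical names, the 5×5 shape is assumed from the int[5] row type mentioned above, and returning NULL on failure is just one possible design.

```c
#include <assert.h>
#include <stddef.h>

enum { ROWS = 5, COLS = 5 };   /* assumed shape: int a[5][5] */

int grid[ROWS][COLS];          /* hypothetical example array, zero-initialized */

/* Hypothetical helper: returns a pointer to a[i][j], or NULL if either
   index falls outside its own dimension. The test runs before a + i or
   *(a + i) + j is computed, so no out-of-range pointer is ever formed. */
int *checked_at(int (*a)[COLS], long i, long j)
{
    if (i < 0 || i >= ROWS || j < 0 || j >= COLS)
        return NULL;
    return &a[i][j];
}

/* Each row is its own array object holding exactly COLS ints. */
int row_length(void)
{
    return (int)(sizeof grid[0] / sizeof grid[0][0]);
}
```

With this helper, checked_at(grid, 0, 20) and checked_at(grid, -1, 0) both yield NULL instead of touching memory, while checked_at(grid, 4, 4) yields &grid[4][4].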

In the last case, however, a + 5 (the pointer that a[5] dereferences) points one past the last element of the array, which is valid to form as per §6.5.6/8:

… if the expression P points to the last element of an array object, the expression (P)+1 points one past the last element of the array object …

However, later in that same paragraph:

If the result points one past the last element of the array object, it shall not be used as the operand of a unary * operator that is evaluated.

So, while a + 5 is a valid pointer, dereferencing it, as a[5] does, causes undefined behavior before the final [-3] is even applied (and that index is also out of bounds, and therefore UB).
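The legitimate use of a one-past-the-end pointer is as a loop bound that is compared against but never dereferenced. A small sketch, again assuming a 5×5 array and using the hypothetical names sum_grid and sum_of_ones:

```c
#include <assert.h>

enum { ROWS = 5, COLS = 5 };   /* assumed shape: int a[5][5] */

/* Sums every element, using one-past-the-end pointers only as sentinels. */
long sum_grid(int (*a)[COLS])
{
    long total = 0;
    int (*row_end)[COLS] = a + ROWS;       /* one past the last row: fine to form */

    for (int (*row)[COLS] = a; row != row_end; ++row) {
        int *col_end = *row + COLS;        /* one past the last int of this row */
        for (int *p = *row; p != col_end; ++p)
            total += *p;                   /* p always points inside the row here */
    }
    return total;                          /* row_end and col_end were only compared */
}

/* Fills a local grid with ones and sums it. */
long sum_of_ones(void)
{
    int a[ROWS][COLS];

    for (int i = 0; i < ROWS; i++)
        for (int j = 0; j < COLS; j++)
            a[i][j] = 1;
    return sum_grid(a);
}
```

Here sum_of_ones() adds up 25 ones. Neither sentinel is ever the operand of unary *, which is the line that a[5][-3] crosses.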
