Why do arrays in C decay to pointers?

Rationale

Let’s examine function calls because the problems are nicely visible there: Why are arrays not simply passed to functions as arrays, by value, as a copy?

There is first a purely pragmatic reason: Arrays can be big; it may not be advisable to pass them by value because they
could exceed the stack size, especially in the 1970s. The first compilers were written on a PDP-7 with about 9 kB RAM.

There is also a more technical reason rooted in the language. It would be hard to generate code for a function call with arguments whose size is not known at compile time. For all arrays, including variable length arrays in modern C, simply the addresses are put on the call stack. The size of an address is of course well known. Even languages with elaborate array types carrying run time size information do not pass the objects proper on the stack. These languages typically pass “handles” around, which is what C has effectively done, too, for 40 years. See Jon Skeet here and an illustrated explanation he references (sic) here.

Now a language could make it a requirement that an array always have a complete type; i.e. whenever it is used, its complete declaration including the size must be visible. This is, after all, what C requires from structures (when they are accessed). Consequently, structures can be passed to functions by value. Requiring the complete type for arrays as well would make function calls easily compilable and obviate the need to pass additional length arguments: sizeof() would still work as expected inside the callee. But imagine what that means. If the size were really part of the array’s argument type, we would need a distinct function for each array size:

// for user input.
int average_ten(int arr[10]);

// for my new Hasselblad.
int average_twohundredfivemilliononehundredfourtyfivethousandsixhundred(int arr[16544*12400]);
// ...

In fact it would be totally comparable to passing structures, which differ in type if their elements differ (say, one struct with 10 int elements and one with 16544*12400). It is obvious that arrays need more flexibility. For example, as demonstrated one could not sensibly provide generally usable library functions which take array arguments.

This “strong typing conundrum” is, in fact, what happens in C++ when a function takes a reference to an array; that is also the reason why nobody does it, at least not explicitly. It is totally inconvenient to the point of being useless except for cases which target specific uses, and in generic code: C++ templates provide compile-time flexibility which is not available in C.

If, in existing C, indeed arrays of known sizes should be passed by value there is always the possibility to wrap them in a struct. I remember that some IP related headers on Solaris defined address family structures with arrays in them, allowing to copy them around. Because the byte layout of the struct was fixed and known, that made sense.

For some background it’s also interesting to read The Development of the C Language by Dennis Ritchie about the origins of C. C’s predecessor BCPL didn’t have any arrays; the memory was just homogeneous linear memory with pointers into it.

More Related Contents:

Leave a Comment Cancel reply