When is an array name or a function name ‘converted’ into a pointer? (in C)

An expression of array type is implicitly converted to a pointer to the array object’s first element unless it is:

  • The operand of the unary & operator;
  • The operand of sizeof; or
  • A string literal in an initializer used to initialize an array object.

An example of the third case is:

char arr[6] = "hello";

"hello" is an array expression, of type char[6] (5 plus 1 for the '\0' terminator). It’s not converted to an address; the full 6-byte value of "hello" is copied into the array object arr.

On the other hand, in this:

char *ptr = "hello";

the array expression "hello" “decays” to a pointer to the 'h', and that pointer value is used to initialize the pointer object ptr. (It should really be const char *ptr, but that’s a side issue.)

An expression of function type (such as a function name) is implicitly converted to a pointer to the function unless it is:

  • The operand of the unary & operator; or
  • The operand of sizeof (sizeof function_name is illegal, not the size of a pointer).

That’s it.

In both cases, no pointer object is created. The expression is converted to (“decays” to) a pointer value, also known as an address.

(The “conversion” in both these cases isn’t an ordinary type conversion like the one specified by a cast operator. It doesn’t take the value of an operand and use it to compute the value of the result, as would happen for an int-to-float conversion. Rather, an expression of array or function type is “converted” at compile time to an expression of pointer type. In my opinion the word “adjusted” would have been clearer than “converted”.)

Note that both the array indexing operator [] and the function call “operator” () require a pointer. In an ordinary function call like func(42), the function name func “decays” to a pointer-to-function, which is then used in the call. (This conversion needn’t actually be performed in the generated code, as long as the function call does the right thing.)

The rule for functions has some odd consequences. The expression func is, in most contexts, converted to a pointer to the function func. In &func, func is not converted to a pointer, but & yields the function’s address, i.e., a pointer value. In *func, func is implicitly converted to a pointer, then * dereferences it to yield the function itself, which is then (in most contexts) converted to a pointer. In ****func, this happens repeatedly.

(A draft of the C11 standard says that there’s another exception for arrays, namely when the array is the operand of the new _Alignof operator. This is an error in the draft, corrected in the final published C11 standard; _Alignof can only be applied to a parenthesized type name, not to an expression.)

The address of an array and the address of its first member:

int arr[10];
&arr;    /* address of entire array */
&arr[0]; /* address of first element */

are the same memory address, but they’re of different types. The former is the address of the entire array object, and is of type int(*)[10] (pointer to array of 10 ints); the latter is of type int*. The two types are not compatible (you can’t legally assign an int* value to an int(*)[10] object, for example), and pointer arithmetic behaves differently on them.

There’s a separate rule that says that a declared function parameter of array or function type is adjusted at compile time (not converted) to a pointer parameter. For example:

void func(int arr[]);

is exactly equivalent to

void func(int *arr);

These rules (conversion of array expressions and adjustment of array parameters) combine to create a great deal of confusion regarding the relationship between arrays and pointers in C.

Section 6 of the comp.lang.c FAQ does an excellent job of explaining the details.

The definitive source for this is the ISO C standard. N1570 (1.6 MB PDF) is the latest draft of the 2011 standard; these conversions are specified in section 6.3.2.1, paragraphs 3 (arrays) and 4 (functions). That draft has the erroneous reference to _Alignof, which doesn’t actually apply.

Incidentally, the printf calls in your example are strictly incorrect:

int fruits[10];
printf("Address IN constant pointer is %p\n",fruits);
printf("Address OF constant pointer is %p\n",&fruits); 

The %p format requires an argument of type void*. If pointers of type int* and int(*)[10] have the same representation as void* and are passed as arguments in the same way, as is the case for most implementations, it’s likely to work, but it’s not guaranteed. You should explicitly convert the pointers to void*:

int fruits[10];
printf("Address IN constant pointer is %p\n", (void*)fruits);
printf("Address OF constant pointer is %p\n", (void*)&fruits);

So why is it done this way? The problem is that arrays are in a sense second-class citizens in C. You can’t pass an array by value as an argument in a function call, and you can’t return it as a function result. For arrays to be useful, you need to be able to operate on arrays of different lengths. Separate strlen functions for char[1], for char[2], for char[3], and so forth (all of which are distinct types) would be impossibly unwieldy. So instead arrays are accessed and manipulated via pointers to their elements, with pointer arithmetic providing a way to traverse those elements.

If an array expression didn’t decay to a pointer (in most contexts), then there wouldn’t be much you could do with the result. And C was derived from earlier languages (BCPL and B) that didn’t necessarily even distinguish between arrays and pointers.

Other languages are able to deal with arrays as first-class types, but doing so requires extra features that wouldn’t be “in the spirit of C”, which continues to be a relatively low-level language.

I’m less sure about the rationale for treating functions this way. It’s true that there are no values of function type, but the language could have required a function (rather than a pointer-to-function) as the prefix in a function call, requiring an explicit * operator for an indirect call: (*funcptr)(arg). Being able to omit the * is a convenience, but not a tremendous one. It’s probably a combination of historical inertia and consistency with the treatment of arrays.
