Bare arrays as function args in C and C++ always decay to pointers, just like in several other contexts.
Arrays inside structs or unions don’t, and are passed by value. This is why ABIs need to care about how they’re passed, even though it doesn’t happen in C for bare arrays.
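The difference is visible from plain C, without looking at any asm. The sketch below is illustrative only; the names `struct wrapped`, `set_first_bare`, `set_first_wrapped` and `demo_decay_vs_copy` are made up for this example. A write through the bare-array parameter reaches the caller's array, while a write to the struct parameter only touches the callee's copy.

```c
#include <assert.h>  /* static_assert (C11) */

struct wrapped { int a[4]; };

/* The struct carries the whole array, not a pointer to it. */
static_assert(sizeof(struct wrapped) >= 4 * sizeof(int),
              "struct contains the full array");

/* The parameter is adjusted to int*, so this writes to the caller's array. */
static void set_first_bare(int arr[4]) { arr[0] = 99; }

/* The whole struct (array included) is copied; the caller's object is untouched. */
static void set_first_wrapped(struct wrapped w) { w.a[0] = 99; }

/* Returns 1 if the bare array was modified and the wrapped one was not. */
int demo_decay_vs_copy(void) {
    int bare[4] = {1, 2, 3, 4};
    struct wrapped w = {{1, 2, 3, 4}};
    set_first_bare(bare);
    set_first_wrapped(w);
    return bare[0] == 99 && w.a[0] == 1;
}
```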
As Keith Thomson points out, the relevant part of the C standard is N1570 section 6.7.6.3 paragraph 7:
A declaration of a parameter as “array of type” shall be adjusted to
“qualified pointer to type”, where the type qualifiers (if any) are
those specified within the [ and ] of the array type derivation … (stuff about foo[static 10], see below)
Note that multidimensional arrays work as arrays of array type, so only the outer-most level of “array-ness” is converted to a pointer to array type.
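A small C11 sketch of that adjustment, using hypothetical function names: a parameter declared as `int m[3][4]` has type `int (*)[4]` inside the function, so the inner dimension survives and only the outer one is lost. The prototype and the definition of `sum_grid` spell the parameter two different ways, and the compiler accepts them as the same function type.

```c
#include <assert.h>  /* static_assert (C11) */
#include <stddef.h>  /* size_t */

/* A 2-D array is an array of 3 rows, each row being an int[4]. */
static_assert(sizeof(int[3][4]) == 3 * sizeof(int[4]), "array of arrays");

/* Declared with a 2-D array parameter, but m is really int (*)[4]. */
int param_is_pointer_to_row(int m[3][4]) {
    return _Generic(m, int (*)[4]: 1, default: 0);
}

/* sizeof *m is the size of one row (int[4]), not of a pointer. */
size_t row_bytes(int m[3][4]) {
    return sizeof *m;
}

/* Prototype with array syntax, definition with pointer syntax: same type. */
int sum_grid(int m[][4], int rows);
int sum_grid(int (*m)[4], int rows) {
    int total = 0;
    for (int r = 0; r < rows; r++)
        for (int c = 0; c < 4; c++)
            total += m[r][c];
    return total;
}

/* Returns 1 if all three observations hold. */
int demo_multidim(void) {
    int grid[3][4] = {{1, 2, 3, 4}, {5, 6, 7, 8}, {9, 10, 11, 12}};
    return param_is_pointer_to_row(grid)
        && row_bytes(grid) == 4 * sizeof(int)
        && sum_grid(grid, 3) == 78;
}
```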
Terminology: The x86-64 ABI doc uses the same terminology as ARM, where structs and arrays are “aggregates” (multiple elements at sequential addresses). So the phrase “aggregates and unions” comes up a lot, because unions are handled similarly by the language and the ABI.
It’s the recursive rule for handling composite types (struct/union/class) that brings the array-passing rules in the ABI into play. This is the only way you’ll see asm that copies an array to the stack as part of a function arg, for C or C++:
struct s { int a[8]; };
void ext(struct s byval);
void foo() { struct s tmp = {{0}}; ext(tmp); }
gcc 6.1 compiles it (for the AMD64 SysV ABI, with -O3) to the following:
sub rsp, 40 # align the stack and leave room for `tmp` even though it's never stored?
push 0
push 0
push 0
push 0
call ext
add rsp, 72
ret
In the x86-64 ABI, pass-by-value happens by actual copying (into registers or the stack), not by hidden pointers.
Note that return-by-value does pass a pointer as a “hidden” first arg (in rdi), when the return value is too large to fit in the 128-bit concatenation of rdx:rax (and isn’t a vector being returned in vector regs, etc. etc.).
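To make the size threshold concrete, here is a hedged sketch with two made-up functions, `make_big` and `make_small`. The comments describe what the x86-64 System V rules imply for each return type; the C source itself behaves the same on any ABI, and the register assignments are only visible in the compiler's asm output.

```c
#include <assert.h>  /* static_assert (C11) */

struct s     { int a[8]; };          /* 32 bytes when int is 4 bytes */
struct small { long lo; long hi; };  /* 16 bytes on x86-64 System V */

/* Size checks that match the rdx:rax limit on typical targets. */
static_assert(sizeof(struct s) > 16, "too large for rdx:rax");
static_assert(sizeof(struct small) <= 16, "fits in rdx:rax");

/* Too large for registers: the caller passes a hidden pointer in rdi
   and the callee stores the result through it. */
struct s make_big(void) {
    struct s r = {{1, 2, 3}};        /* remaining elements are zero */
    return r;
}

/* Two integer eightbytes: returned in rax (lo) and rdx (hi). */
struct small make_small(void) {
    struct small r = {10, 20};
    return r;
}
```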
It would be possible for the ABI to use a hidden pointer to pass-by-value objects above a certain size, and trust the called function not to modify the original, but that’s not what the x86-64 ABI chooses to do. That would be better in some cases (especially for inefficient C++ with lots of copying without modification (i.e. wasted)), but worse in other cases.
SysV ABI bonus reading: As the x86 tag wiki points out, the current version of the ABI standard doesn’t fully document the behaviour that compilers rely on: clang/gcc sign/zero extend narrow args to 32bit.
Note that to really guarantee that a function arg is a fixed-size array, C99 and later lets you use the static keyword in a new way: on array sizes. (It’s still passed as a pointer, of course. This doesn’t change the ABI).
void bar(int arr[static 10]);
This allows compiler warnings about going out of bounds. It also potentially enables better optimization if the compiler knows it’s allowed to access elements that the C source doesn’t. (See this blog post). However, the arg still has type int*, not an actual array, so sizeof(arr) == sizeof(int*).
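A minimal C11 sketch of the same point, with hypothetical helper names. It uses `_Generic` rather than `sizeof(arr)` to inspect the parameter type, because some compilers warn about `sizeof` on an array parameter.

```c
#include <assert.h>  /* static_assert (C11) */

/* A real array type keeps its full size... */
static_assert(sizeof(int[10]) == 10 * sizeof(int), "array type keeps its size");

/* ...but a [static 10] parameter is still adjusted to plain int*. */
int static_param_is_plain_pointer(int arr[static 10]) {
    return _Generic(arr, int *: 1, default: 0);
}

/* The callee may assume at least 10 valid elements behind the pointer. */
int sum_first_10(int arr[static 10]) {
    int total = 0;
    for (int i = 0; i < 10; i++)
        total += arr[i];
    return total;
}

/* Returns 1 if the parameter is an int* and the sum is as expected. */
int demo_static_param(void) {
    int data[10] = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10};
    return static_param_is_plain_pointer(data) && sum_first_10(data) == 55;
}
```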
The same keyword page for C++ indicates that ISO C++ does not support this usage of static; it’s another one of those C-only features, along with C99 variable-length-arrays and a few other goodies that C++ doesn’t have.
In C++, you can use std::array<int,10> to get compile-time size information passed to the callee. However, you have to manually pass it by reference if that’s what you want, since it’s of course just a class containing an int arr[10]. Unlike a C-style array, it doesn’t decay to T* automatically.
The ARM doc that you linked doesn’t seem to actually call arrays an aggregate type: Section 4.3 Composite Types (which discusses alignment) distinguishes arrays from aggregate types, even though they appear to be a special case of its definition for aggregates.
A Composite Type is a collection of one or more Fundamental Data Types that are handled as a single entity at the
procedure call level. A Composite Type can be any of:
- An aggregate, where the members are laid out sequentially in memory
- A union, where each of the members has the same address
- An array, which is a repeated sequence of some other type (its base type).
The definitions are recursive; that is, each of the types may contain a Composite Type as a member
“Composite” is an umbrella term that includes arrays, structs, and unions.
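In C terms, the three kinds of composite type from that quote look roughly like the sketch below. The type names are invented for illustration, and the checks only show the layout properties the quoted definitions describe.

```c
#include <assert.h>  /* static_assert (C11) */
#include <stddef.h>  /* offsetof, size_t */

struct pair   { int x; int y; };     /* aggregate: members laid out sequentially */
union  either { int i; float f; };   /* union: members share one address */

/* Recursive definition: a composite whose members are themselves composites. */
struct nested {
    struct pair  p;
    union either u;
    int          arr[4];             /* array: repeated sequence of a base type */
};

static_assert(offsetof(struct pair, y) > offsetof(struct pair, x),
              "struct members are at increasing offsets");
static_assert(offsetof(union either, i) == offsetof(union either, f),
              "union members start at the same offset");

/* Size of the array member: 4 repetitions of its base type. */
size_t array_member_bytes(void) {
    return sizeof ((struct nested *)0)->arr;
}
```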