C++ on x86-64: when are structs/classes passed and returned in registers?

The ABI specification is defined here.
A newer version is available here.

I assume the reader is familiar with the terminology of the document and can classify the primitive types.


If the object size is larger than two eight-bytes, it is passed in memory:

struct foo
{
    unsigned long long a;
    unsigned long long b;
    unsigned long long c;               //Commenting this gives mov rax, rdi
};

unsigned long long foo(struct foo f)
{ 
  return f.a;                           //mov     rax, QWORD PTR [rsp+8]
} 
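
The size rule can be checked at compile time. The sketch below uses hypothetical names (two_words, three_words, fits_two_eightbytes) and tests only the size condition, so a type that passes it may still end up in memory for one of the other reasons.

```cpp
// Hypothetical types for illustration only.
struct two_words   { unsigned long long a, b; };      // 16 bytes
struct three_words { unsigned long long a, b, c; };   // 24 bytes

// True when T is no larger than two eight-bytes (16 bytes).
// This checks the size rule only, not the other conditions.
template <typename T>
constexpr bool fits_two_eightbytes()
{
    return sizeof(T) <= 2 * 8;
}

static_assert(fits_two_eightbytes<two_words>(),    "candidate for registers");
static_assert(!fits_two_eightbytes<three_words>(), "always passed in memory");
```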

If it is non-POD, it is passed in memory:

struct foo
{
    unsigned long long a;
    foo(const struct foo& rhs){}            //Commenting this gives mov rax, rdi
};

unsigned long long foo(struct foo f)
{
  return f.a;                               //mov     rax, QWORD PTR [rdi]
}

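Here the object is passed by invisible reference: the caller constructs the argument in its own memory (copy elision may apply) and passes its address in rdi.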
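
Whether a class falls into this case can be approximated at compile time with the standard type traits. In the sketch below trivial_for_calls is a hypothetical helper, and its condition (trivial copy constructor, move constructor and destructor) is an assumption based on what the C++ ABI calls "non-trivial for the purposes of calls"; it does not try to handle deleted special members.

```cpp
#include <type_traits>

// Hypothetical helper: true when copying, moving and destroying T are all
// trivial, so T can at least be considered for register passing.
template <typename T>
constexpr bool trivial_for_calls()
{
    return std::is_trivially_copy_constructible<T>::value
        && std::is_trivially_move_constructible<T>::value
        && std::is_trivially_destructible<T>::value;
}

struct plain { unsigned long long a; };

struct with_copy_ctor
{
    unsigned long long a;
    with_copy_ctor(const with_copy_ctor&) {}   // user-provided, hence non-trivial
};

static_assert(trivial_for_calls<plain>(),           "eligible for registers");
static_assert(!trivial_for_calls<with_copy_ctor>(), "passed by invisible reference");
```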

If it contains unaligned fields, it is passed in memory:

struct __attribute__((packed)) foo         //Removing packed gives mov rax, rsi
{
    char b;
    unsigned long long a;
};

unsigned long long foo(struct foo f)
{
  return f.a;                             //mov     rax, QWORD PTR [rsp+9]
}
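
The effect of packed on the layout can be made visible with offsetof. This is a small sketch assuming GCC or Clang on x86-64 (the attribute is a compiler extension); packed_foo and natural_foo are hypothetical names.

```cpp
#include <cstddef>

// __attribute__((packed)) is a GCC/Clang extension.
struct __attribute__((packed)) packed_foo
{
    char b;
    unsigned long long a;
};

struct natural_foo
{
    char b;
    unsigned long long a;
};

// Packed: a starts at byte 1 and straddles the first two eight-bytes.
static_assert(offsetof(packed_foo, a) == 1, "a is unaligned");
static_assert(sizeof(packed_foo) == 9,      "no padding at all");

// Natural layout (x86-64 assumed): a sits alone in the second eight-byte.
static_assert(offsetof(natural_foo, a) == 8, "a is aligned");
static_assert(sizeof(natural_foo) == 16,     "exactly two eight-bytes");
```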

If none of the above is true, the fields of the object are considered.
If one of the fields is itself a struct/class, the procedure is applied recursively.
The goal is to classify each of the two eight-bytes (8B) in the object.

The classes of the fields in each 8B are then considered.
Note that a whole number of fields always occupies each 8B, thanks to the alignment requirement above.

Let C be the class of the 8B and D be the class of the field under consideration.
Let new_class be pseudo-defined as

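cls new_class(cls D, cls C)
{
   if (D == C)
      return C;                  //Equal classes are left untouched

   if (D == NO_CLASS)
      return C;

   if (C == NO_CLASS)
      return D;                  //Otherwise a lone X87 field would turn into MEMORY below

   if (D == MEMORY || C == MEMORY)
      return MEMORY;

   if (D == INTEGER || C == INTEGER)
      return INTEGER;

   if (D == X87 || C == X87 || D == X87UP || C == X87UP)
      return MEMORY;

   return SSE;
}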

then the class of the 8B is computed as follows

C = NO_CLASS;

for (field f : fields)
{
    D = get_field_class(f);        //Note this may recursively call this proc
    C = new_class(D, C);
}

Once we have the classes of the two 8Bs, say C1 and C2, then

if (C1 == MEMORY || C2 == MEMORY)
    C1 = C2 = MEMORY;

if (C2 == X87UP && C1 != X87)          //X87UP must be preceded by X87
    C1 = C2 = MEMORY;

if (C2 == SSEUP && C1 != SSE)
    C2 = SSE;

Note: this is my interpretation of the algorithm given in the ABI document.
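
The merge step, the per-8B loop and the final clean-up can be put together in a compilable form. This is only a sketch (C++14 or later) under assumed names: the Cls enum, classify_eightbyte, Pair and post_merge are hypothetical, and field classes are given by hand instead of being derived from real types. It handles equal classes and a NO_CLASS running class explicitly and includes the X87UP clean-up rule from the ABI text.

```cpp
#include <cstddef>

// Argument classes named in the text (assumed enum for this sketch).
enum class Cls { NoClass, Integer, Sse, SseUp, X87, X87Up, Memory };

// Merge the class d of a field into the running class c of its eight-byte.
constexpr Cls new_class(Cls d, Cls c)
{
    if (d == c)            return c;
    if (d == Cls::NoClass) return c;
    if (c == Cls::NoClass) return d;
    if (d == Cls::Memory  || c == Cls::Memory)  return Cls::Memory;
    if (d == Cls::Integer || c == Cls::Integer) return Cls::Integer;
    if (d == Cls::X87 || c == Cls::X87 || d == Cls::X87Up || c == Cls::X87Up)
        return Cls::Memory;
    return Cls::Sse;
}

// Fold the classes of all the fields that share one eight-byte.
template <std::size_t N>
constexpr Cls classify_eightbyte(const Cls (&fields)[N])
{
    Cls c = Cls::NoClass;
    for (std::size_t i = 0; i < N; ++i)
        c = new_class(fields[i], c);
    return c;
}

struct Pair { Cls c1, c2; };

// Clean-up applied once both eight-bytes of an object have a class.
constexpr Pair post_merge(Cls c1, Cls c2)
{
    if (c1 == Cls::Memory || c2 == Cls::Memory) return {Cls::Memory, Cls::Memory};
    if (c2 == Cls::X87Up && c1 != Cls::X87)     return {Cls::Memory, Cls::Memory};
    if (c2 == Cls::SseUp && c1 != Cls::Sse)     return {c1, Cls::Sse};
    return {c1, c2};
}

// Sample eight-bytes: an int next to a float, and two floats side by side.
constexpr Cls int_and_float[] = { Cls::Integer, Cls::Sse };
constexpr Cls two_floats[]    = { Cls::Sse, Cls::Sse };

static_assert(classify_eightbyte(int_and_float) == Cls::Integer, "INTEGER wins over SSE");
static_assert(classify_eightbyte(two_floats)    == Cls::Sse,     "all SSE stays SSE");
```

Because everything is constexpr, the compiler itself evaluates the two checks at the bottom; other field combinations can be tried the same way.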


Example

struct foo
{
    unsigned long long a;
    long double b;
};

unsigned long long foo(struct foo f)
{
  return f.a;
}

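The 8Bs and their fields

First 8B: a
Second 8B: padding (b must be 16-byte aligned)
Third and fourth 8Bs: b

The object is 32 bytes, i.e. four 8Bs, so it already exceeds the two-8B limit and is MEMORY by the size rule.
Even looking at the fields, a is INTEGER while b is X87 (low 8B) and X87UP (high 8B), and arguments of class X87/X87UP are passed in memory.
The final class is MEMORY for the whole object.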
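
The layout of this struct is worth checking, since long double carries a 16-byte alignment on x86-64 System V. The sketch below assumes that platform (the checks are guarded so it still compiles elsewhere); foo_ld is a hypothetical name for the struct of the example.

```cpp
#include <cstddef>

struct foo_ld
{
    unsigned long long a;
    long double b;
};

#if defined(__x86_64__) && !defined(_WIN32)
// Assumed: x86-64 System V, where long double is the 80-bit x87 format
// stored in 16 bytes with 16-byte alignment.
static_assert(alignof(long double) == 16, "b needs 16-byte alignment");
static_assert(offsetof(foo_ld, b) == 16,  "8 bytes of padding follow a");
static_assert(sizeof(foo_ld) == 32,       "four eight-bytes in total");
#endif
```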


Example

struct foo
{
    double a;
    long long b;
};

long long foo(struct foo f)
{
  return f.b;                     //mov rax, rdi
}

The 8Bs and their fields

First 8B: a
Second 8B: b

a is SSE, so the first 8B is SSE.
b is INTEGER so the second 8B is INTEGER.

The final classes are the ones calculated.


Return values

The values are returned according to their classes:

  • MEMORY
    The caller passes a hidden first argument to the function, pointing to where the result must be stored.
    In C++ this often involves copy elision/return value optimisation.
    This address must also be returned in rax, thereby returning MEMORY classes “by reference” to a hidden, caller-allocated buffer.

    If the type has class MEMORY, then the caller provides space for the return
    value and passes the address of this storage in %rdi as if it were the first
    argument to the function. In effect, this address becomes a “hidden” first
    argument.
    On return %rax will contain the address that has been passed in by the
    caller in %rdi.

  • INTEGER (this includes pointers)
    The registers rax and rdx as needed.

  • SSE and SSEUP
    The registers xmm0 and xmm1 as needed.

  • X87 and X87UP
    The register st0
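
To make the list concrete, here are four hypothetical functions, one per case, annotated with the registers the result is expected to travel in. The register comments follow from the rules above and are not a substitute for looking at the compiler output; the functions are constexpr only so that their results can be checked at compile time, and the convention applies when they are called at run time.

```cpp
// Hypothetical types for illustration only.
struct two_ints    { long a, b; };       // INTEGER, INTEGER
struct two_doubles { double x, y; };     // SSE, SSE
struct big         { long a, b, c; };    // larger than two eight-bytes: MEMORY

// Expected: a in rax, b in rdx.
constexpr two_ints make_ints(long a, long b) { return {a, b}; }

// Expected: x in xmm0, y in xmm1.
constexpr two_doubles make_doubles(double x, double y) { return {x, y}; }

// Expected: the caller passes the address of the result in rdi (so a, b, c
// arrive in rsi, rdx, rcx) and receives that same address back in rax.
constexpr big make_big(long a, long b, long c) { return {a, b, c}; }

// Expected: the result is left in st0.
constexpr long double make_ld(long double v) { return v; }

static_assert(make_ints(1, 2).b == 2, "second member round-trips");
```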


PODs

The technical definition is here.

The definition from the ABI is reported below.

A de/constructor is trivial if it is an implicitly-declared default de/constructor and if:

   • its class has no virtual functions and no virtual base classes, and
   • all the direct base classes of its class have trivial de/constructors, and
   • for all the nonstatic data members of its class that are of class type (or array thereof), each such class has a trivial de/constructor.
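
Each of the three conditions can be probed with the standard type traits. The types below are hypothetical and only meant to trip one condition at a time; the traits report triviality as the C++ standard defines it, which I assume matches the ABI wording for these simple cases.

```cpp
#include <type_traits>

struct base_plain  { int x; };
struct derived_ok  : base_plain { int y; };           // trivial base: still trivial

struct has_virtual { virtual void f() {} int x; };     // virtual function
struct has_dtor    { ~has_dtor() {} };                 // user-provided destructor
struct holds_dtor  { has_dtor member; };               // member with non-trivial destructor

static_assert(std::is_trivially_copy_constructible<derived_ok>::value,   "trivial copy");
static_assert(std::is_trivially_destructible<derived_ok>::value,         "trivial destructor");
static_assert(!std::is_trivially_copy_constructible<has_virtual>::value, "virtual function");
static_assert(!std::is_trivially_destructible<has_dtor>::value,          "own destructor");
static_assert(!std::is_trivially_destructible<holds_dtor>::value,        "member destructor");
```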


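Note that each 8B is classified independently so that each one can be passed accordingly (e.g. one in a general purpose register and the other in an SSE register).
However, if there are not enough parameter registers left for all the 8Bs of an object, the whole object is passed on the stack.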
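
The effect of running out of registers can be illustrated with two hypothetical functions taking the same 16-byte struct at different positions. The register assignments in the comments are what the rules predict for x86-64 System V, not something the code itself verifies.

```cpp
struct pair_of_longs { long a, b; };   // INTEGER, INTEGER: needs two integer registers

// Integer argument registers, in order: rdi, rsi, rdx, rcx, r8, r9.

// p comes first: expected in rdi and rsi, then x in rdx.
constexpr long early_arg(pair_of_longs p, long x)
{
    return p.a + p.b + x;
}

// a..e take rdi, rsi, rdx, rcx, r8. Only r9 is left, which cannot hold both
// eight-bytes of p, so the whole of p is expected on the stack, while f can
// still use r9.
constexpr long late_arg(long a, long b, long c, long d, long e,
                        pair_of_longs p, long f)
{
    return a + b + c + d + e + p.a + p.b + f;
}

static_assert(early_arg({1, 2}, 3) == 6, "same values, different locations");
```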
