Why does the x86-64 System V calling convention pass args in registers instead of just the stack?

instead of put the first 6 arguments in registers just to move them onto the stack in the function prologue?

I was looking at some code that gcc generated and that’s what it always did.

Then you forgot to enable optimization. gcc -O0 spills everything to memory so you can modify them with a debugger while single-stepping. That’s obviously horrible for performance, so compilers don’t do that unless you force them to by compiling with -O0.

x86-64 System V allows int add(int x, int y) { return x+y; } to compile to
lea eax, [rdi + rsi] / ret, which is what compilers actually do as you can see on the Godbolt compiler explorer.

Stack-args calling conventions are slow and obsolete. RISC machines have been using register-args calling conventions since before x86-64 existed, and on OSes that still care about 32-bit x86 (i.e. Windows), there are better calling conventions like __vectorcall that pass the first 2 integer args in registers.

i386 System V hasn’t been replaced because people mostly don’t care as much about 32-bit performance on other OSes; we just use 64-bit code with the nicely-designed x86-64 System V calling convention.

For more about the tradeoff between register args and call-preserved vs. call-clobbered registers in calling convention design, see Why not store function parameters in XMM vector registers?, and also Why does Windows64 use a different calling convention from all other OSes on x86-64?.

Leave a Comment