Here is what the gcc manual has to say about it:
-mpush-args
-mno-push-args
Use PUSH operations to store outgoing parameters. This method is shorter and usually
equally fast as method using SUB/MOV operations and is enabled by default.
In some cases disabling it may improve performance because of improved scheduling
and reduced dependencies.
-maccumulate-outgoing-args
If enabled, the maximum amount of space required for outgoing arguments will be
computed in the function prologue. This is faster on most modern CPUs because of
reduced dependencies, improved scheduling and reduced stack usage when preferred
stack boundary is not equal to 2. The drawback is a notable increase in code size.
This switch implies -mno-push-args.
Apparently -maccumulate-outgoing-args is enabled by default, overriding -mpush-args. Explicitly compiling with -mno-accumulate-outgoing-args does revert to the PUSH method, here.
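If you want to see the difference yourself, a small test case along these lines should do. This is only a sketch: the file name push_args.c and the function names sum5 and caller are made up for illustration, and it assumes a GCC that can build 32-bit x86 code (-m32), since on x86-64 the first six integer arguments are passed in registers rather than on the stack. The exact assembly you get will depend on your GCC version and tuning options.

```c
/* push_args.c: hypothetical file name, used only for this illustration.
 *
 * Compile to assembly both ways and compare the call sequence in caller():
 *   gcc -m32 -O2 -S -mno-accumulate-outgoing-args push_args.c -o push.s
 *   gcc -m32 -O2 -S -maccumulate-outgoing-args    push_args.c -o mov.s
 */

/* noinline/noclone keep a real call, with real argument passing,
 * visible in the generated code even at -O2. */
__attribute__((noinline, noclone))
int sum5(int a, int b, int c, int d, int e)
{
    return a + b + c + d + e;
}

int caller(void)
{
    /* With the 32-bit x86 calling convention, all five arguments
     * are passed on the stack, so this call exercises the option. */
    return sum5(1, 2, 3, 4, 5);
}
```

With -mno-accumulate-outgoing-args I would expect caller to store the arguments with PUSH instructions, and with -maccumulate-outgoing-args to reserve stack space once with SUB and then fill it with MOV, but check the generated .s files rather than taking that for granted.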
2019 update: modern CPUs have had efficient push/pop since about Pentium M. -mno-accumulate-outgoing-args (and using push) eventually became the default for -mtune=generic in Jan 2014.