“enter” vs “push ebp; mov ebp, esp; sub esp, imm” and “leave” vs “mov esp, ebp; pop ebp”

There is a performance difference, especially for enter. On modern processors this decodes to some 10 to 20 µops, while the three instruction sequence is about 4 to 6, depending on the architecture. For details consult Agner Fog’s instruction tables.

Additionally the enter instruction usually has a quite high latency, for example 8 clocks on a core2, compared to the 3 clocks dependency chain of the three instruction sequence.

Furthermore the three instruction sequence may be spread out by the compiler for scheduling purposes, depending on the surrounding code of course, to allow more parallel execution of instructions.

Leave a Comment