This optimization has since been implemented in GCC. It can be enabled with the -fno-plt option and the noplt function attribute:
Do not use the PLT for external function calls in position-independent code. Instead, load the callee address at call sites from the GOT and branch to it. This leads to more efficient code by eliminating PLT stubs and exposing GOT loads to optimizations. On architectures such as 32-bit x86 where PLT stubs expect the GOT pointer in a specific register, this gives more register allocation freedom to the compiler. Lazy binding requires use of the PLT; with -fno-plt all external symbols are resolved at load time.

Alternatively, the function attribute noplt can be used to avoid calls through the PLT for specific external functions.

In position-dependent code, a few targets also convert calls to functions that are marked to not use the PLT to use the GOT instead.
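As a rough illustration of the per-function form, the sketch below redeclares one external function with the attribute. The NOPLT macro and the roll_die helper are names invented for this example, and the choice of rand as the callee is arbitrary. The sketch assumes that GCC merges the attribute from the redeclaration into the declaration in <stdlib.h>; on compilers that do not know noplt, the macro expands to nothing and the call is compiled as usual.

```c
#include <stdlib.h>

/* NOPLT is a hypothetical helper macro for this sketch. It expands to the
   noplt attribute only when the compiler reports support for it, and to
   nothing otherwise. */
#if defined(__has_attribute)
#  if __has_attribute(noplt)
#    define NOPLT __attribute__((noplt))
#  endif
#endif
#ifndef NOPLT
#  define NOPLT
#endif

/* Redeclare an external library function with the attribute. This assumes
   the attribute is merged with the earlier declaration from <stdlib.h>. */
extern int rand(void) NOPLT;

/* Hypothetical caller. When noplt takes effect in position-independent
   code, the call to rand is made through its GOT entry rather than
   through a PLT stub. */
int roll_die(void)
{
    return rand() % 6 + 1;
}
```

To see the difference, one can compile the file to assembly with and without the attribute, for example with gcc -O2 -fPIC -S, and compare the call to rand. On x86-64 the attributed version is expected to be an indirect call through the GOT instead of a call to rand@PLT, though the exact output depends on the target and compiler version. Passing -fno-plt applies the same treatment to every external call in the translation unit without any source changes.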