x64 nasm: pushing memory addresses onto the stack & call function

The 64-bit OS X ABI largely complies with the System V ABI (AMD64 Architecture Processor Supplement). Its code model is very similar to the small position-independent code (PIC) model, with some differences. In that code model, all local and small data is accessed directly using RIP-relative addressing. As noted in the comments by Z boson, the image base for 64-bit Mach-O executables lies beyond the first 4 GiB of the virtual address space. Therefore push msg is not only an invalid way to put the address of msg on the stack, it is also an impossible one, since PUSH does not support 64-bit immediate values. The code should instead look similar to:

    ; this is what you *would* do for later args on the stack
    lea   rax, [rel msg]  ; RIP-relative addressing
    push  rax

In this particular case, though, there is no need to push the value on the stack at all. The 64-bit calling convention mandates that the first six integer/pointer arguments are passed in the registers RDI, RSI, RDX, RCX, R8, and R9, in exactly that order. The first eight floating-point or vector arguments go into XMM0, XMM1, …, XMM7. The stack is used only once all the available registers are taken, or for arguments that cannot fit in any of those registers (e.g. an 80-bit long double value). Because PUSH has no 64-bit immediate form, a 64-bit immediate is first loaded into a register with MOV (the QWORD variant) and the register is then pushed. Simple return values are passed back in the RAX register. The caller must also keep the stack aligned on a 16-byte boundary at the point of the call.
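For the case where the registers do run out, here is a minimal, untested sketch that passes a seventh argument to printf on the stack. The label fmt, the format string, and the constant are made up for illustration. It assumes the same OS X symbol naming (_main, _printf) as the rest of this answer.

```nasm
default rel
section .data
    fmt:  db '%d %d %d %d %d %llx', 10, 0   ; hypothetical format string

section .text
    global _main
    extern _printf

_main:
    push    rbp                     ; RSP is 8 mod 16 on entry; now 16-byte aligned
    mov     rbp, rsp

    lea     rdi, [rel fmt]          ; arg 1: pointer, RIP-relative
    mov     esi, 1                  ; arg 2
    mov     edx, 2                  ; arg 3
    mov     ecx, 3                  ; arg 4
    mov     r8d, 4                  ; arg 5
    mov     r9d, 5                  ; arg 6: last integer register

    mov     rax, 0x1122334455667788 ; 64-bit immediate needs MOV, PUSH cannot encode it
    sub     rsp, 8                  ; padding so RSP stays 16-byte aligned at the call
    push    rax                     ; arg 7: first stack argument, lowest address

    xor     eax, eax                ; AL = 0: no vector registers used
    call    _printf
    add     rsp, 16                 ; caller removes the stack argument and the padding

    xor     eax, eax                ; return 0 from main
    pop     rbp
    ret
```

The padding matters because a single 8-byte push would leave RSP misaligned at the call. Assuming NASM's macho64 output format, this would be assembled with something like nasm -f macho64 and then linked through the C compiler driver so that _printf resolves against the system C library.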

printf is a special function because it takes a variable number of arguments. When calling such functions, AL (the low byte of RAX) should be set to the number of floating-point arguments passed in the vector registers. Also note that RIP-relative addressing is preferred for data that lies within 2 GiB of the code.
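To make the AL rule concrete, here is a minimal, untested sketch of a printf call that passes one double. The labels fmt and val and their contents are invented for illustration. Since one vector register (XMM0) carries an argument, AL is set to 1.

```nasm
default rel
section .data
    fmt:  db 'value = %f', 10, 0    ; hypothetical format string
    val:  dq 3.14159                ; hypothetical double-precision constant

section .text
    global _main
    extern _printf

_main:
    push    rbp                     ; aligns the stack to 16 bytes before the call
    mov     rbp, rsp

    lea     rdi, [rel fmt]          ; integer/pointer arg 1: the format string
    movsd   xmm0, [rel val]         ; floating-point arg 1 goes into XMM0
    mov     eax, 1                  ; AL = 1: one vector register holds an argument
    call    _printf

    xor     eax, eax                ; return 0 from main
    pop     rbp
    ret
```

Note that the integer and floating-point arguments are counted separately: the format string is still the first integer/pointer argument in RDI, while the double is the first floating-point argument in XMM0.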

Here is how gcc translates printf("This is a test\n"); into assembly on OS X:

    xorl    %eax, %eax             # (1)
    leaq    L_.str(%rip), %rdi     # (2)
    callq   _printf                # (3)

L_.str:
    .asciz   "This is a test\n"

(This is AT&T-style assembly: the source operand is on the left, the destination is on the right, register names are prefixed with %, and the data width is encoded as a suffix to the instruction name.)

At (1) zero is put into AL (by zeroing the whole of RAX, which avoids partial-register delays) since no floating-point arguments are being passed. At (2) the address of the string is loaded into RDI. Note how the value is actually an offset from the current value of RIP. Since the assembler doesn’t know what this offset will be, it puts a relocation request in the object file. The linker then sees the relocation and fills in the correct value at link time.

I am not a NASM guru, but I think the following code should do it:

default rel             ; make [rel msg] the default for [msg]
section .data
    msg:  db 'This is a test', 10, 0    ; something stupid here

section .text
    global _main
    extern _printf

_main:
    push    rbp                 ; re-aligns the stack by 16 before call
    mov     rbp, rsp       

    xor     eax, eax            ; AL = 0: no FP args in XMM regs
    lea     rdi, [rel msg]
    call    _printf

    mov     rsp, rbp
    pop     rbp
    ret
