When should I use size directives in x86?

You’re right; it is rather ambiguous. Assuming we’re talking about Intel syntax, it is true that you can often get away with not using size directives. Any time the assembler can figure it out automatically, they are optional. For example, in the instruction

mov    esi, DWORD PTR [rax*4+0x419260]

the DWORD PTR specifier is optional for exactly the reason you suppose: the assembler can figure out that it is to move a DWORD-sized value, since the value is being moved into a DWORD-sized register.

Similarly, in

mov    rsi, QWORD PTR [rax*4+0x419260]

the QWORD PTR specifier is optional for the exact same reason.

But it is not always optional. Consider your first example:

mov    QWORD PTR [rip+0x21b520], 0x1

Here, the QWORD PTR specifier is not optional. Without it, the assembler has no idea what size value you want to store starting at the address rip+0x21b520. Should 0x1 be stored as a BYTE? Extended to a WORD? A DWORD? A QWORD? Some assemblers might guess, but you can’t be assured of the correct result without explicitly specifying what you want.

In other words, when one of the operands is a register, the size specifier is generally optional because the assembler can infer the operand size from the size of the register. However, when the instruction has only a memory operand, or a memory operand and an immediate value, nothing in the instruction fixes the size, so you need the size specifier to ensure you get the results you want.
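To see why the width of an immediate store matters, here is a rough Python sketch (a simulation, not assembly) of what ends up in memory when 0x1 is stored as each of the four sizes. The `store` and `demo` helpers and the 8-byte buffer pre-filled with 0xff are made up for illustration; the only assumption about the hardware is x86's little-endian byte order.

```python
import struct

# Map each Intel-syntax size directive to a little-endian struct format.
# These names mirror the directives in the text; the mapping itself is
# just an illustration, not something an assembler exposes.
WIDTHS = {"BYTE": "<B", "WORD": "<H", "DWORD": "<I", "QWORD": "<Q"}


def store(memory: bytearray, offset: int, size: str, value: int) -> None:
    """Emulate `mov <size> PTR [memory+offset], value` (hypothetical helper)."""
    struct.pack_into(WIDTHS[size], memory, offset, value)


def demo(size: str) -> str:
    """Store 0x1 with the given width into 8 bytes of 0xff and return the hex dump."""
    memory = bytearray(b"\xff" * 8)
    store(memory, 0, size, 0x1)
    return memory.hex()


if __name__ == "__main__":
    for size in WIDTHS:
        print(f"{size:5} -> {demo(size)}")
```

Each width overwrites a different number of neighboring bytes (1, 2, 4, or 8), which is why the assembler cannot safely guess when you leave the size off.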

Personally, I prefer to always include the size when I write code. It’s a couple of characters more typing, but it forces me to think about it and state explicitly what I want. If I screw up and code a mismatch, then the assembler will scream loudly at me, which has caught bugs more than once. I also think having it there enhances readability. So here I agree with old_timer, even though his perspective appears to be somewhat unpopular.

Disassemblers also tend to be verbose in their outputs, including the size specifiers even when they are optional. Hans Passant theorized in the comments this was to preserve backwards-compatibility with old-school assemblers that always needed these, but I’m not sure that’s true. It might be part of it, but in my experience, disassemblers tend to be wordy in lots of different ways, and I think this is just to make it easier to analyze code with which you are unfamiliar.

Note that AT&T syntax takes a slightly different tack. Rather than writing the size as a prefix to the operand, it adds a suffix to the instruction mnemonic: b for byte, w for word, l for dword, and q for qword. So, the three previous examples become:

movl    0x419260(,%rax,4), %esi
movq    0x419260(,%rax,4), %rsi
movq    $0x1, 0x21b520(%rip)

Again, on the first two instructions, the l and q suffixes are optional, because the assembler can deduce the appropriate size from the register operand. On the last instruction, just like in Intel syntax, the suffix is not optional. So AT&T syntax follows the same rules as Intel syntax, just with a different format for the size specifiers.
