x86 - w3toppers.com

how does push and pop work in assembly

The latter, POP EBP, is equivalent to MOV EBP, [ESP] followed by ADD ESP, 4, but without modifying flags, as if the stack pointer were adjusted with LEA ESP, [ESP+4]. (This is Intel syntax: destination on the left, source on the right.)
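As a rough illustration of that equivalence, here is a minimal C model of a downward-growing 32-bit stack. The `toy_stack` type and its functions are hypothetical names invented for this sketch; it does no bounds checking and only mirrors the order of the load/store and the stack-pointer adjustment.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical model: `mem` stands in for stack memory and `esp` is a
   byte offset into it. The stack grows toward lower offsets. */
typedef struct {
    uint8_t  mem[64];
    uint32_t esp;
} toy_stack;

/* An empty stack points one past the end of its memory. */
void toy_init(toy_stack *s)
{
    s->esp = sizeof s->mem;
}

/* PUSH value: first make room, then store at the new top of stack. */
void toy_push(toy_stack *s, uint32_t value)
{
    s->esp -= 4;                         /* like SUB ESP, 4 */
    memcpy(&s->mem[s->esp], &value, 4);  /* like MOV [ESP], value */
}

/* POP: first load from the top of stack, then release the slot. */
uint32_t toy_pop(toy_stack *s)
{
    uint32_t value;
    memcpy(&value, &s->mem[s->esp], 4);  /* like MOV reg, [ESP] */
    s->esp += 4;                         /* like ADD ESP, 4 */
    return value;
}
```

Values come back in the reverse order they were pushed, and the stack pointer returns to its starting offset once every push has been matched by a pop.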

Why is the Carry Flag set during a subtraction when zero is the minuend?

The carry flag is the carry or borrow out of the most significant bit (MSb): CF (bit 0), Carry flag: Set if an arithmetic operation generates a carry or a borrow out of the most-significant bit of the result; cleared otherwise. This flag indicates an overflow condition for unsigned-integer arithmetic. It is also used in multiple-precision … Read more
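To make the borrow case concrete, the following C sketch computes what CF would be after a 32-bit SUB. The `sub_result` type and `sub_with_carry_flag` function are hypothetical names for this illustration; the only claim is that an unsigned subtraction borrows exactly when the minuend is smaller than the subtrahend.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical pair: the 32-bit result plus the carry flag a SUB would set. */
typedef struct {
    uint32_t result;
    bool     carry;
} sub_result;

sub_result sub_with_carry_flag(uint32_t minuend, uint32_t subtrahend)
{
    sub_result r;
    r.result = minuend - subtrahend;   /* wraps modulo 2^32, like the register */
    r.carry  = minuend < subtrahend;   /* a borrow out of the MSb sets CF */
    return r;
}
```

With a minuend of zero, any non-zero subtrahend needs a borrow, so `sub_with_carry_flag(0, 1)` yields a result of 0xFFFFFFFF with the carry set, while `sub_with_carry_flag(0, 0)` leaves it clear.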

Memory alignment on a 32-bit Intel processor

The usual rule of thumb (straight from Intel’s and AMD’s optimization manuals) is that every data type should be aligned by its own size. An int32 should be aligned on a 32-bit boundary, an int64 on a 64-bit boundary, and so on. A char will fit just fine anywhere. Another rule of thumb is, of … Read more
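The "aligned by its own size" rule can be checked with simple bit arithmetic. This is a small sketch with hypothetical helper names (`is_naturally_aligned`, `align_up`), and it assumes the size is a power of two, which holds for the usual 1-, 2-, 4- and 8-byte types.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* True if `address` is a multiple of `size`. Assumes `size` is a power of two. */
bool is_naturally_aligned(uintptr_t address, size_t size)
{
    return (address & (size - 1)) == 0;
}

/* Round `address` up to the next multiple of `size` (a power of two). */
uintptr_t align_up(uintptr_t address, size_t size)
{
    return (address + (size - 1)) & ~(uintptr_t)(size - 1);
}
```

For example, address 0x1004 is fine for a 4-byte int32 but not for an 8-byte int64, and any address passes for a 1-byte char.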

inlining failed in call to always_inline ‘_mm_mullo_epi32’: target specific option mismatch

A general method to find the instruction switch for gcc. File intrin.sh:

#!/bin/bash
get_instruction () {
    [ -z "$1" ] && exit
    func_name="$1 "
    header_file=`grep --include=\*intrin.h -Rl "$func_name" /usr/lib/gcc | head -n1`
    [ -z "$header_file" ] && exit
    >&2 echo "find in: $header_file"
    target_directive=`grep "#pragma GCC target(\|$func_name" $header_file | grep -B 1 "$func_name" | head … Read more
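Once the required switch is known (for _mm_mullo_epi32 it is SSE4.1, i.e. -msse4.1), one alternative to compiling the whole file with that switch is to enable it for a single function. The sketch below assumes GCC or Clang, since `__attribute__((target(...)))` and `__builtin_cpu_supports` are compiler extensions; `mul4`, `mul4_sse41` and `mul4_scalar` are hypothetical names, and the scalar path is only a fallback for CPUs or targets without SSE4.1.

```c
#include <stdint.h>

#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
#include <immintrin.h>
#define HAVE_SSE41_PATH 1

/* SSE4.1 is enabled for this function only, so the always_inline
   intrinsic can be inlined here without -msse4.1 on the command line. */
__attribute__((target("sse4.1")))
static void mul4_sse41(const int32_t *a, const int32_t *b, int32_t *out)
{
    __m128i va = _mm_loadu_si128((const __m128i *)a);
    __m128i vb = _mm_loadu_si128((const __m128i *)b);
    _mm_storeu_si128((__m128i *)out, _mm_mullo_epi32(va, vb));
}
#endif

/* Portable fallback: low 32 bits of each product, same as the intrinsic. */
static void mul4_scalar(const int32_t *a, const int32_t *b, int32_t *out)
{
    for (int i = 0; i < 4; i++)
        out[i] = (int32_t)((uint32_t)a[i] * (uint32_t)b[i]);
}

/* Multiply four 32-bit lanes, picking the SSE4.1 path when the CPU has it. */
void mul4(const int32_t *a, const int32_t *b, int32_t *out)
{
#ifdef HAVE_SSE41_PATH
    if (__builtin_cpu_supports("sse4.1")) {
        mul4_sse41(a, b, out);
        return;
    }
#endif
    mul4_scalar(a, b, out);
}
```

Calling `_mm_mullo_epi32` directly from a function that lacks the target attribute (and without -msse4.1) is what triggers the "target specific option mismatch" error in the first place.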

Fastest inline-assembly spinlock

Although there is already an accepted answer, a few things were missed that could be used to improve all the answers. They are taken from this Intel article, which is all about fast lock implementations: Spin on a volatile read, not an atomic instruction; this avoids unneeded bus locking, especially on highly contended locks. Use back-off … Read more
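The first two points can be sketched in C11 as a test-and-test-and-set lock: try the atomic exchange once, and while the lock is held, spin on a plain load with a growing pause between reads. This is a minimal illustration and not the article's implementation; the `ttas_*` names, the back-off cap of 1024 and the `cpu_relax` helper are assumptions made for the example, and `__builtin_ia32_pause` is a GCC/Clang builtin used only on x86.

```c
#include <stdatomic.h>
#include <stdbool.h>

typedef struct {
    atomic_bool locked;
} ttas_lock;

/* Hint to the CPU that we are in a spin-wait loop (PAUSE on x86). */
static inline void cpu_relax(void)
{
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
    __builtin_ia32_pause();
#endif
}

void ttas_init(ttas_lock *l)
{
    atomic_init(&l->locked, false);
}

/* Returns true if the lock was acquired without waiting. */
bool ttas_try_lock(ttas_lock *l)
{
    return !atomic_load_explicit(&l->locked, memory_order_relaxed) &&
           !atomic_exchange_explicit(&l->locked, true, memory_order_acquire);
}

void ttas_lock_acquire(ttas_lock *l)
{
    unsigned backoff = 1;
    for (;;) {
        /* The only locked (bus-level) operation: one exchange per attempt. */
        if (!atomic_exchange_explicit(&l->locked, true, memory_order_acquire))
            return;
        /* Spin on a plain read so waiters do not fight over the cache line. */
        while (atomic_load_explicit(&l->locked, memory_order_relaxed)) {
            for (unsigned i = 0; i < backoff; i++)
                cpu_relax();
            if (backoff < 1024)
                backoff *= 2;   /* exponential back-off, capped */
        }
    }
}

void ttas_unlock(ttas_lock *l)
{
    atomic_store_explicit(&l->locked, false, memory_order_release);
}
```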

x86 Assembly pushl/popl don’t work with “Error: suffix or operands invalid”

In 64-bit mode you cannot push and pop 32-bit values; you need pushq and popq. Also, you will not get a proper exit this way. On 32-bit x86, you would need to set %eax to 1 to select the exit() system call, and set %ebx to the exit code you actually want. On 64-bit x86 … Read more

What is register %eiz?

See Why Does GCC LEA EIZ?: Apparently %eiz is a pseudo-register that just evaluates to zero at all times (like r0 on MIPS). … I eventually found a mailing list post by binutils guru Ian Lance Taylor that reveals the answer. Sometimes GCC inserts NOP instructions into the code stream to ensure proper alignment and … Read more

Write system call won’t print characters from a register

A quick fix of your code:

push 0x41414141 ; put 'AAAA' into stack memory
mov ecx, esp    ; pointer to the 'AAAA'
mov eax, 4      ; write is syscall 4 for 32-bit Linux
mov ebx, 1      ; stdout
mov edx, 4
int 0x80
add esp, 4      ; restore stack

No explanation, as you should first check what … Read more
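The same point shows up when calling write() from C: it takes the address of a buffer, so a value held in a variable (or register) has to sit in memory and be passed by pointer. A minimal sketch for a POSIX system follows; `write_u32_bytes` is a hypothetical helper name.

```c
#include <stdint.h>
#include <unistd.h>

/* Write the four bytes of `word` to `fd`. The parameter lives in memory
   here, much like the value pushed on the stack in the assembly version,
   and write() is given its address rather than the value itself. */
ssize_t write_u32_bytes(int fd, uint32_t word)
{
    return write(fd, &word, sizeof word);
}
```

Calling `write_u32_bytes(1, 0x41414141)` sends the four bytes 0x41 ('A') to standard output; passing the value itself where the pointer is expected would make the kernel treat 0x41414141 as an address.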

Assembly bubble sort swap

I think I’d use a pointer to the current position in the list, instead of an index that needs to be scaled every time you use it:

mov esi, offset list
top:
mov edi, esi
inner:
mov eax, [edi]
mov edx, [edi+4]
cmp eax, edx
jle no_swap
mov [edi+4], eax
mov [edi], edx
no_swap:
add edi, … Read more
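For comparison, here is the pointer-walking idea as a C sketch. Because the assembly excerpt is cut off, the loop bounds below are an assumption (a standard bubble sort that shrinks the unsorted range by one each pass), and `bubble_sort_i32` is a hypothetical name; the compare-and-swap step matches the signed `jle no_swap` test.

```c
#include <stddef.h>
#include <stdint.h>

/* Sort `count` signed 32-bit values in ascending order. */
void bubble_sort_i32(int32_t *list, size_t count)
{
    if (count < 2)
        return;

    /* `end` points at the last element of the still-unsorted range. */
    for (int32_t *end = list + count - 1; end > list; end--) {
        /* `p` plays the role of EDI: it walks the list directly. */
        for (int32_t *p = list; p < end; p++) {
            int32_t a = p[0];   /* mov eax, [edi]   */
            int32_t b = p[1];   /* mov edx, [edi+4] */
            if (a > b) {        /* skip the swap when a <= b (jle) */
                p[1] = a;
                p[0] = b;
            }
        }
    }
}
```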

Combine 32- and 64bit DLLs in one program

On 64-bit Windows, 64-bit processes cannot use 32-bit DLLs and 32-bit processes cannot use 64-bit DLLs. Microsoft has documented this: On 64-bit Windows, a 64-bit process cannot load a 32-bit dynamic-link library (DLL). Additionally, a 32-bit process cannot load a 64-bit DLL. You would need a 32-bit process that communicates with the 32-bit DLL … Read more