assembly - w3toppers.com

How to force GCC to assume that a floating-point expression is non-negative?

You can write assert(x*x >= 0.f) as a compile-time promise instead of a runtime check as follows in GNU C:

    #include <cmath>

    float test1(float x) {
        float tmp = x*x;
        if (!(tmp >= 0.0f))
            __builtin_unreachable();
        return std::sqrt(tmp);
    }

(Related: What optimizations does __builtin_unreachable facilitate?) You could also wrap if(!x)__builtin_unreachable() in a macro and call …
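
A minimal sketch of the macro idea mentioned in the excerpt, for GCC or Clang. The names ASSUME and sqrt_of_square are hypothetical and not taken from the original answer, and the NDEBUG switch is an added assumption so that debug builds still check the condition at run time. Note that x*x >= 0 is false when x is NaN, so the promise only holds for non-NaN inputs.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical macro name. In release builds (NDEBUG defined) the condition
// becomes a promise to the optimizer; in debug builds it is a normal assert.
#ifdef NDEBUG
#define ASSUME(cond) do { if (!(cond)) __builtin_unreachable(); } while (0)
#else
#define ASSUME(cond) assert(cond)
#endif

// Hypothetical example function: square root of x*x.
float sqrt_of_square(float x) {
    float tmp = x * x;
    ASSUME(tmp >= 0.0f);  // false only if tmp is NaN
    return std::sqrt(tmp);
}
```

Breaking the promise in a release build is undefined behavior, so it is worth keeping the debug-mode assert around while testing.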

How do I compile the asm generated by GCC?

Yes, you can use gcc to assemble your asm code. Use -c to stop before linking, like this: gcc -c file.S -o file.o This produces an object file named file.o. To link it into an executable, run the following afterwards: gcc file.o -o file

How to break on assembly instruction at a given address in gdb?

Try: break *0x0000000000400448

Write system call won’t print characters from a register

A quick fix of your code:

    push 0x41414141   ; put 'AAAA' into stack memory
    mov  ecx, esp     ; pointer to the 'AAAA'
    mov  eax, 4       ; write is syscall 4 for 32-bit Linux
    mov  ebx, 1       ; stdout
    mov  edx, 4
    int  0x80
    add  esp, 4       ; restore stack

No explanation, as you should first check what …
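
The same idea can be sketched in C++: write() takes a pointer to bytes in memory, so a value has to be stored somewhere before it can be printed. The function name write_aaaa is hypothetical, and the snippet assumes a POSIX system.

```cpp
#include <cassert>
#include <cstdint>
#include <unistd.h>

// Hypothetical helper: stores 0x41414141 ('AAAA') in memory and passes its
// address to write(). All four bytes are 0x41, so byte order does not matter.
ssize_t write_aaaa(int fd) {
    assert(fd >= 0);
    std::uint32_t word = 0x41414141u;      // lives on the stack, like the push
    return write(fd, &word, sizeof word);  // pointer + length, not the value
}
```

Calling write_aaaa(1) sends the four bytes to standard output, mirroring the ebx = 1 in the assembly version.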

Assembly bubble sort swap

I think I’d use a pointer to the current position in the list, instead of an index that needs to be scaled every time you use it:

        mov esi, offset list
    top:
        mov edi, esi
    inner:
        mov eax, [edi]
        mov edx, [edi+4]
        cmp eax, edx
        jle no_swap
        mov [edi+4], eax
        mov [edi], edx
    no_swap:
        add edi, …
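
A C++ sketch of the same pointer-walking idea. The excerpt's loop is truncated, so the loop bounds below are an assumption that follows a conventional bubble sort rather than the original answer's exact structure; the function name bubble_sort is hypothetical.

```cpp
#include <cassert>
#include <cstddef>

// Bubble sort that advances a pointer through the array instead of
// recomputing list[i] from an index on every access.
void bubble_sort(int* list, std::size_t count) {
    assert(list != nullptr || count == 0);
    if (count < 2) return;
    // After each outer pass the largest remaining element sits at *end.
    for (int* end = list + count - 1; end > list; --end) {
        for (int* p = list; p < end; ++p) {
            int a = p[0];      // like: mov eax, [edi]
            int b = p[1];      // like: mov edx, [edi+4]
            if (a > b) {       // like: cmp eax, edx / jle no_swap
                p[1] = a;
                p[0] = b;
            }
        }
    }
}
```

The swap writes the two already-loaded values back in exchanged positions, so no temporary beyond the two registers is needed.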

Why is this C++ wrapper class not being inlined away?

It is inlined, but not optimized away because you compiled with -O0 (the default). That generates asm for consistent debugging, allowing you to modify any C++ variable while stopped at a breakpoint on any line. This means the compiler spills everything from registers after every statement, and reloads what it needs for the next. So …
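
A small illustration you can compile yourself to compare optimization levels. The class Meters and both functions are hypothetical stand-ins, not the wrapper from the original question. With optimization enabled (for example -O2), GCC and Clang would be expected to emit the same code for both functions; at -O0 you should expect extra stores and reloads in the wrapped version.

```cpp
#include <cassert>

// Hypothetical thin wrapper around a double.
class Meters {
    double value_;
public:
    explicit Meters(double v) : value_(v) {}
    double value() const { return value_; }
};

// Plain version with no wrapper.
double add_raw(double a, double b) { return a + b; }

// Same arithmetic routed through the wrapper class.
double add_wrapped(double a, double b) {
    return Meters(a).value() + Meters(b).value();
}

// Quick self-check that the wrapper does not change the result.
void check_wrapper() { assert(add_wrapped(1.5, 2.0) == add_raw(1.5, 2.0)); }
```

Compiling with g++ -S at -O0 and again at -O2 and comparing the two outputs is one way to see the difference the excerpt describes.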

Create an arg array for execve on the stack

You can put the argv array onto the stack and load the address of it into rsi. The first member of argv is a pointer to the program name, so we can use the same address that we load into rdi. xor edx, edx ; Load NULL to be used both as the third ; …
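
For comparison, a C++ sketch of the same layout: a two-element argv on the stack whose first entry is the program path, with NULL as the third argument. The function name run_with_minimal_argv, the fork/wait wrapper, and the 127 exit code are assumptions added so the example can be called safely; passing NULL as envp is accepted on Linux but is not portable.

```cpp
#include <cassert>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Runs `path` in a child process with argv = { path, NULL } and envp = NULL.
// Returns the child's exit status, 127 if execve failed in the child,
// or -1 if fork/waitpid failed or the child did not exit normally.
int run_with_minimal_argv(const char* path) {
    assert(path != nullptr);
    pid_t pid = fork();
    if (pid < 0) return -1;
    if (pid == 0) {
        // argv lives on the child's stack; argv[0] reuses the path pointer.
        char* argv[] = { const_cast<char*>(path), nullptr };
        execve(path, argv, nullptr);  // returns only on failure
        _exit(127);
    }
    int status = 0;
    if (waitpid(pid, &status, 0) < 0) return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

The path pointer plays the role of rdi, the argv array the role of rsi, and the final nullptr the role of the zeroed edx.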

Can a processor do memory and arithmetic operations at the same time?

You’re right, a modern x86 will decode add dword [mem], 1 to 3 uops: a load, an ALU add, and a store. (This is actually a simplification of various things, including Intel’s micro-fusion and how AMD always keeps a load+ALU together in some parts of the pipeline…) Those 3 dependent operations can’t happen at the …
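
The three dependent steps can be spelled out in C++ as a rough illustration. The function name increment_in_memory is hypothetical, and a compiler is free to turn this back into a single memory-destination add; the point is only that each step needs the result of the previous one.

```cpp
#include <cassert>

// What `add dword [mem], 1` does, written as three dependent operations.
int increment_in_memory(int* p) {
    assert(p != nullptr);
    int loaded = *p;       // 1. load from memory
    int sum = loaded + 1;  // 2. ALU add, needs the loaded value
    *p = sum;              // 3. store, needs the sum
    return sum;
}
```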

Why is the “start small” algorithm for branch displacement not optimal?

Here’s a proof that, in the absence of the anomalous jumps mentioned by harold in the comments, the “start small” algorithm is optimal: First, let’s establish that “start small” always produces a feasible solution — that is, one that doesn’t contain any short encoding of a too-long jump. The algorithm essentially amounts to repeatedly asking …
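
A minimal sketch of what a "start small" pass might look like, under a deliberately simplified model that is an assumption and not part of the original answer: every jump is either 2 bytes (signed 8-bit displacement) or 5 bytes, displacements are measured from the end of the jump, and there are no anomalous jumps. The names Insn and start_small are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Insn {
    int size;      // byte size of a non-jump instruction; unused for jumps
    int target;    // index of the jump target, or -1 for a non-jump
    bool is_long;  // current encoding choice for a jump
};

constexpr int kShortSize = 2;  // assumed short encoding size
constexpr int kLongSize = 5;   // assumed long encoding size

int encoded_size(const Insn& insn) {
    if (insn.target < 0) return insn.size;
    return insn.is_long ? kLongSize : kShortSize;
}

// Start with every jump short, then repeatedly widen any short jump whose
// displacement does not fit, until a pass changes nothing.
// Returns the number of jumps that ended up long.
int start_small(std::vector<Insn>& code) {
    for (Insn& insn : code) insn.is_long = false;
    bool changed = true;
    while (changed) {
        changed = false;
        // Recompute the address of every instruction for the current choices.
        std::vector<int> addr(code.size() + 1, 0);
        for (std::size_t i = 0; i < code.size(); ++i)
            addr[i + 1] = addr[i] + encoded_size(code[i]);
        for (std::size_t i = 0; i < code.size(); ++i) {
            Insn& insn = code[i];
            if (insn.target < 0 || insn.is_long) continue;
            assert(static_cast<std::size_t>(insn.target) < code.size());
            int disp = addr[insn.target] - addr[i + 1];
            if (disp < -128 || disp > 127) {
                insn.is_long = true;  // jumps only ever grow, so this terminates
                changed = true;
            }
        }
    }
    int long_count = 0;
    for (const Insn& insn : code)
        if (insn.target >= 0 && insn.is_long) ++long_count;
    return long_count;
}
```

Because a jump is widened only when its short form provably cannot reach the target under the current (smallest so far) layout, widening one jump can push another out of range, which is why the pass repeats until nothing changes.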

Shadow space example

The shadow space must be provided immediately before the call. Think of the shadow space as a relic of the old stdcall/cdecl conventions: for WriteFile you needed five pushes. The shadow space stands in for the last four pushes (the first four arguments). Now you need four registers, the shadow space (just the space, contents don’t …
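
The stack arithmetic can be sketched as a small helper. It assumes the Microsoft x64 convention (8-byte slots, 32 bytes of shadow space, rsp 16-byte aligned at the call) and a caller that has pushed nothing else since entry, so rsp is 8 bytes off alignment because of the return address. The function names are hypothetical.

```cpp
#include <cassert>

// At least four slots are always reserved: that is the shadow space.
constexpr unsigned slot_count(unsigned arg_count) {
    return arg_count < 4 ? 4u : arg_count;
}

// Bytes to subtract from rsp before the call: slots for the arguments,
// plus 8 bytes of padding if needed to restore 16-byte alignment.
constexpr unsigned call_frame_bytes(unsigned arg_count) {
    return slot_count(arg_count) * 8u +
           ((slot_count(arg_count) * 8u + 8u) % 16u == 0 ? 0u : 8u);
}

// WriteFile takes five arguments: 32 bytes of shadow space + 8 for the fifth.
static_assert(call_frame_bytes(5) == 40, "five-argument call needs 40 bytes");
```

Under these assumptions a four-argument call also needs 40 bytes: 32 for the shadow space and 8 of padding.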