What is data alignment? Why and when should I be worried when typecasting pointers in C? [duplicate]

I’ll try to explain in short.

What is data alignment?

The architecture of your computer consists of a processor and memory.
Memory is organized in cells, like this:

 0x00 |   data  |  
 0x01 |   ...   |
 0x02 |   ...   |

Each memory cell has a fixed size: the number of bits it can store. This is architecture dependent.

When you define a variable in your C/C++ program, it occupies one or more memory cells.

For example

int variable = 12;

Suppose each cell contains 32 bits and the int type is 32 bits wide; then somewhere in your memory:

variable: | 0 0 0 c |  // c is hexadecimal of 12.

When your CPU has to operate on that variable, it needs to bring it into a register. In “1 clock” the CPU can fetch a small, fixed number of bits from memory; that size is usually called a WORD. This size is architecture dependent as well.

Now suppose you have a variable that, because of its offset, is stored across two cells.

For example, I have two different pieces of data to store (I’m going to use a “string” representation to make it clearer):

data1: "ab"
data2: "cdef"

So the memory will be laid out this way (2 different cells):

|a b c d|     |e f 0 0|

That is, data1 occupies just half of the first cell, so data2 occupies the remaining half and part of the second cell.

Now suppose your CPU wants to read data2. The CPU needs 2 clocks to access the data, because in one clock it reads the first cell and in the other clock it reads the remaining part from the second cell.
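
To make the counting concrete, here is a minimal sketch of the "one cell per clock" model used in this answer. The function name cells_touched is hypothetical, and the model is a simplification for illustration, not a description of how any particular CPU fetches memory.

```c
#include <assert.h>
#include <stddef.h>

/* Number of cells of `cell_size` bytes touched by an item of `size` bytes
   that starts at byte `offset`. Simplified model: one cell per access. */
size_t cells_touched(size_t offset, size_t size, size_t cell_size)
{
    assert(size > 0 && cell_size > 0);
    size_t first = offset / cell_size;              /* cell holding the first byte */
    size_t last  = (offset + size - 1) / cell_size; /* cell holding the last byte  */
    return last - first + 1;
}
```

With 4-byte cells, cells_touched(2, 4, 4) gives 2 for data2 stored right after data1, while cells_touched(4, 4, 4) gives 1 when data2 starts on a cell boundary.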

If we align data2 in accordance with this memory example, we can
introduce a sort of padding and shift data2 entirely into the second cell.

|a b 0 0|     |c d e f|
     ---
   padding

That way the CPU spends only “1 clock” to access data2.
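
The amount of padding can be computed by rounding the offset up to the next multiple of the alignment. This is a minimal sketch, assuming the alignment is a power of two; align_up and padding_for are hypothetical helper names, not standard library functions.

```c
#include <assert.h>
#include <stddef.h>

/* Round `offset` up to the next multiple of `alignment`.
   `alignment` must be a power of two (1, 2, 4, 8, ...). */
size_t align_up(size_t offset, size_t alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return (offset + alignment - 1) & ~(alignment - 1);
}

/* Number of padding bytes needed in front of an item placed at `offset`. */
size_t padding_for(size_t offset, size_t alignment)
{
    return align_up(offset, alignment) - offset;
}
```

In the example, data1 ends at offset 2, so align_up(2, 4) is 4 and padding_for(2, 4) is 2: the two padding bytes shown in the diagram.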

What an alignment system does

An alignment system just introduces that padding in order to align the data with the memory of the system, in accordance with the architecture.
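
In C, the compiler normally does this for you inside structs. The sketch below mirrors the data1/data2 example with a hypothetical struct record and uses offsetof to show where the compiler placed data2. The exact numbers are the compiler's choice and depend on the platform; `_Alignof` and `static_assert` require C11.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical record mirroring the example: 2 bytes, then a 4-byte value. */
struct record {
    char     data1[2];  /* "ab" */
    uint32_t data2;     /* the compiler may insert padding before this member */
};

/* The compiler must place data2 at a multiple of its alignment. */
static_assert(offsetof(struct record, data2) % _Alignof(uint32_t) == 0,
              "data2 must be suitably aligned");

/* Byte offset at which the compiler placed data2 inside the struct. */
size_t data2_offset(void)
{
    return offsetof(struct record, data2);
}

/* Padding bytes between the end of data1 and the start of data2. */
size_t padding_before_data2(void)
{
    return offsetof(struct record, data2) - sizeof(char[2]);
}

/* Print the layout chosen by the compiler for this platform. */
void print_layout(void)
{
    printf("alignof(uint32_t)     = %zu\n", (size_t)_Alignof(uint32_t));
    printf("offset of data2       = %zu\n", data2_offset());
    printf("padding before data2  = %zu\n", padding_before_data2());
    printf("sizeof(struct record) = %zu\n", sizeof(struct record));
}
```

On a platform where uint32_t is 4-byte aligned, I would expect print_layout to report an offset of 4 and 2 bytes of padding, but check the output on your own compiler rather than relying on that.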

Why should I care about alignment?

I will not go into depth in this answer.
However, broadly speaking, alignment requirements come from the context: the architecture you are running on and how the data is accessed.

In the example above, padding (so that the data is memory-aligned) saves the CPU cycles needed to retrieve the data. This can improve the program's execution performance because fewer memory accesses are needed.

However, beyond the above example (made only for the sake of explanation), there are many other scenarios where memory alignment is useful or even required.

For example, some architectures have strict requirements on how memory can be accessed. In such cases, padding helps lay out memory so that it fulfills the platform's constraints.
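
This is also where pointer casts become risky. Casting a byte pointer to uint32_t * and dereferencing it is undefined behavior in C when the address is not suitably aligned, and on strict architectures it can fault. A minimal sketch of a safer pattern is to copy the bytes with memcpy instead. The helper names load_u32 and is_aligned_u32 are hypothetical, and the alignment check assumes the common case where a pointer converted to uintptr_t reflects the address.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Read a 32-bit value starting at any byte address, aligned or not. */
uint32_t load_u32(const unsigned char *p)
{
    uint32_t value;
    assert(p != NULL);
    memcpy(&value, p, sizeof value);  /* byte-wise copy: no alignment requirement on p */
    return value;
}

/* Report whether `p` is suitably aligned for a uint32_t.
   Assumption: converting the pointer to uintptr_t yields its address. */
int is_aligned_u32(const void *p)
{
    return (uintptr_t)p % _Alignof(uint32_t) == 0;
}
```

For example, given unsigned char buf[8], calling load_u32(buf + 1) is fine, whereas *(uint32_t *)(buf + 1) is not portable, because buf + 1 may not be aligned for a uint32_t.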
