Python struct.pack() behavior

From [Python 2.Docs]: struct – Interpret bytes as packed binary data:

This module performs conversions between Python values and C structs represented as Python strings.

This means that it will print the memory representation of the argument(s) as char sequences. Memory (and everything that resides in it) is a sequence of bytes. Each byte has a value in [0..255] (for simplicity’s sake I use unsigned).
So, when representing a byte, it first looks for a char whose ASCII code matches the byte value. If such a (printable) char is found, it becomes the representation of that byte; otherwise the representation is the byte value (in hex) preceded by \x (the convention for representing non-printable chars). As a side note, (non-extended) ASCII chars have values between 0 and 127.

Example:

  • A byte value of 65 (hex 0x41) will be represented as ‘A’ (as A’s ASCII code is 65)

  • A byte value of 217 (hex 0xd9) will be simply represented as ‘\xd9’ (there’s no printable char with this ASCII code)
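
The two examples above can be checked with a minimal sketch. It assumes the "B" (unsigned byte) format code, which the original question does not use; it was picked here only because it packs one value per byte:

```python
import struct

# Pack two unsigned bytes ("B") holding the values from the examples above.
packed = struct.pack("<2B", 65, 217)

# 65 has a printable ASCII char ('A'); 217 does not, so it is shown as \xd9.
print(repr(packed))  # Python 2: 'A\xd9'   Python 3: b'A\xd9'
```

On Python 3 the result is a bytes object, hence the extra b prefix in its representation.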

Before going further, a few words are needed about endianness: the way data (numbers in our case) is laid out in computer memory. Many resources on this topic can be found on the internet.

I’ll try to briefly explain the difference between big and little endian (again, for simplicity’s sake I’ll stick with the 8 bit atomic element size only).

Imagine we’re doing some memory representation exercises on a piece of paper, or better: on a blackboard. If we were to represent the blackboard as the computer memory, then the upper left corner would be its beginning (address 0) and the addresses would increase as we go to the right (and also down below to the next line when we reach the right edge).
We want to represent the number 0x12345678 as a 4 byte number, starting from the upper left corner (each byte consists of exactly 2 hex digits):

╔═══════════╦══════════╦══════════╦══════════╦══════════╗
║   Byte    ║    01    ║    02    ║    03    ║    04    ║
╠═══════════╬══════════╬══════════╬══════════╬══════════╣
║   Value   ║   0x12   ║   0x34   ║   0x56   ║   0x78   ║
╚═══════════╩══════════╩══════════╩══════════╩══════════╝

Our number’s most significant byte is stored at the lowest memory address (and the least significant byte is stored at the highest), which is big endian. For little endian, the bytes are stored in reversed order: 0x78, 0x56, 0x34, 0x12.
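
As a rough illustration of the two byte orders, here is a small sketch that packs the same number both ways. The variable names are chosen here for illustration only; ">" and "<" are the struct byte order prefixes:

```python
import struct

value = 0x12345678

big = struct.pack(">L", value)     # ">" selects big endian
little = struct.pack("<L", value)  # "<" selects little endian

# bytearray() yields integer byte values on both Python 2 and Python 3.
print([hex(b) for b in bytearray(big)])     # ['0x12', '0x34', '0x56', '0x78']
print([hex(b) for b in bytearray(little)])  # ['0x78', '0x56', '0x34', '0x12']
```

The first list matches the blackboard table; the second one is the same sequence reversed.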

As a conclusion, humans think “big endianly”.

Another topic that I want to cover is: types (int to be more precise). The struct module maps Python values to C native types, so an int will have 4 bytes (the “L” format used here, an unsigned long, also has 4 bytes when combined with “<”; with native sizes it might have 8 on some platforms / architectures). So, a 4 byte int (again, talking about unsigned) has a value in [0..4294967295]. But even a smaller value, 5 for example (which only requires 1 byte), will still occupy 4 bytes: the (most significant) unused bytes will be padded with 0s. So, our number as a 4 byte unsigned int will be (hex): 0x00000005.
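
The sizes mentioned above can be inspected with struct.calcsize. This is only a sketch; the native size printed in the middle depends on the platform and compiler, so no fixed value is assumed for it:

```python
import struct

# Standard size: "<L" is always 4 bytes, regardless of platform.
print(struct.calcsize("<L"))  # 4

# Native size: "@L" follows the C compiler's unsigned long (commonly 4 or 8 bytes).
native_size = struct.calcsize("@L")
print(native_size)

# A small value is still padded with zero bytes up to the full size.
padded = struct.pack(">L", 5)
print(len(padded))  # 4
```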

Now, back to our problem(s): as stated above, 5 is 0x05 (or 0x00000005 as a 4 byte unsigned int) or in chars: “\x00\x00\x00\x05”. But that’s the reverse of what struct.pack displays; I think you already guessed why: it’s in little endian representation. That is given by the 1st (fmt) argument (the “<” part, to be more precise) given to [Python 2.Docs]: struct.pack(fmt, v1, v2, …) (possible values are listed on the same page: [Python 2.Docs]: struct – Byte Order, Size, and Alignment).
For 55555, things are just the same. Its hex representation is 0xd903, or 0x0000d903 as a 4 byte value, so its little endian bytes are \x03\xd9\x00\x00.
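
One way to convince yourself that the byte order is only a matter of interpretation is to unpack the same bytes again. The sketch below uses struct.unpack, which is not part of the original question; the names value and misread are illustrative:

```python
import struct

packed = struct.pack("<L", 55555)

# unpack() always returns a tuple, even for a single value.
(value,) = struct.unpack("<L", packed)
print(value == 55555)  # True: same byte order on both sides

# Reading the same bytes as big endian yields a different number.
(misread,) = struct.unpack(">L", packed)
print(misread == 0x03d90000)  # True: the bytes 03 d9 00 00 read left to right
```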

If it doesn’t make sense yet, take this slightly modified version of your code and play with it by entering different values for data_set and watching the outputs:

code.py:

import struct
fmt = "<L"
data_set = [5, 55555, 0x12345678]

for data in data_set:
    # Build the string shown to the user: the hex value, then the packed bytes without the surrounding quotes
    output_str = "{} - {}".format(hex(data), repr(struct.pack(fmt, data)).strip("'"))
    # print() is Python 3 compatible (however, the formatting above won't behave nicely there: the b' prefix stays)
    print(output_str)

Output:

c:\Work\Dev\StackOverflow\q037990060>"C:\Install\x64\HPE\OPSWpython\2.7.10__00\python.exe" "code.py"
0x5 - \x05\x00\x00\x00
0xd903 - \x03\xd9\x00\x00
0x12345678 - xV4\x12
