Size of a Python list in memory

Here’s a fuller interactive session that will help me explain what’s going on (Python 2.6 on Windows XP 32-bit; the exact byte counts depend on the Python version and platform, but the pattern is the same):

>>> import sys
>>> sys.getsizeof([])
36
>>> sys.getsizeof([1])
40
>>> lst = []
>>> lst.append(1)
>>> sys.getsizeof(lst)
52
>>> 

Note that the empty list is only slightly smaller than the list created as [1]. When an element is appended to an empty list, however, the result is noticeably larger, even though it also holds just one element.

The reason lies in the implementation details in Objects/listobject.c, in the CPython source.

Empty list

When an empty list [] is created, no space for elements is allocated – this can be seen in PyList_New. 36 bytes is the amount of space required for the list data structure itself on a 32-bit machine.

List with one element

When a list with a single element [1] is created, space for one element is allocated in addition to the memory required by the list data structure itself. Again, this can be found in PyList_New. Given size as argument, it computes:

nbytes = size * sizeof(PyObject *);

And then has:

if (size <= 0)
    op->ob_item = NULL;
else {
    op->ob_item = (PyObject **) PyMem_MALLOC(nbytes);
    if (op->ob_item == NULL) {
        Py_DECREF(op);
        return PyErr_NoMemory();
    }
    memset(op->ob_item, 0, nbytes);
}
Py_SIZE(op) = size;
op->allocated = size;

So we see that with size = 1, space for one pointer is allocated: 4 bytes on my 32-bit box.

Appending to an empty list

When calling append on an empty list, here’s what happens:

  • PyList_Append calls app1
  • app1 asks for the list’s size (and gets 0 as an answer)
  • app1 then calls list_resize with size+1 (1 in our case)
  • list_resize has an interesting over-allocation strategy, summarized in a comment in its source

Here is that comment, along with the code that follows it:

/* This over-allocates proportional to the list size, making room
 * for additional growth.  The over-allocation is mild, but is
 * enough to give linear-time amortized behavior over a long
 * sequence of appends() in the presence of a poorly-performing
 * system realloc().
 * The growth pattern is:  0, 4, 8, 16, 25, 35, 46, 58, 72, 88, ...
 */
new_allocated = (newsize >> 3) + (newsize < 9 ? 3 : 6);

/* check for integer overflow */
if (new_allocated > PY_SIZE_MAX - newsize) {
    PyErr_NoMemory();
    return -1;
} else {
    new_allocated += newsize;
}
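To make the growth pattern from that comment concrete, here is a minimal Python sketch that replays the formula for a run of appends. The function names grown_allocation and growth_pattern are made up for illustration and are not part of CPython. The sketch assumes a resize happens only when the new length exceeds the current capacity, and it ignores the overflow check and the shrinking path of list_resize.

```python
def grown_allocation(newsize):
    """Mirror the over-allocation formula quoted from list_resize."""
    extra = (newsize >> 3) + (3 if newsize < 9 else 6)
    return newsize + extra


def growth_pattern(steps):
    """Simulate repeated appends and return the successive capacities."""
    allocated = 0          # capacity of a freshly created empty list
    pattern = [allocated]
    length = 0
    while len(pattern) < steps:
        length += 1        # one more append
        if length > allocated:
            # Assumption: resize only when the list is full.
            allocated = grown_allocation(length)
            pattern.append(allocated)
    return pattern


print(growth_pattern(10))  # [0, 4, 8, 16, 25, 35, 46, 58, 72, 88]
```

The printed sequence matches the one listed in the source comment, which suggests the sketch reads the formula the same way.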

Let’s do some math

Let’s see how the numbers I quoted in the session at the beginning of this article are reached.

So 36 bytes is the size required by the list data structure itself on 32-bit. With a single element, space is allocated for one pointer, so that’s 4 extra bytes – total 40 bytes. OK so far.

When app1 is called on an empty list, it calls list_resize with size=1. The over-allocation formula gives (1 >> 3) + 3 = 3 extra slots on top of the requested 1, so room for 4 pointers is allocated. 4 * 4 = 16 bytes, and 36 + 16 = 52.
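The same arithmetic can be written as a short Python sketch. The constants are taken from the 32-bit session in this article and will differ on other builds. The names LIST_HEADER_BYTES, POINTER_BYTES, list_bytes and slots_after_first_append are made up for illustration.

```python
# Values observed in the 32-bit session above; assumptions for other builds.
LIST_HEADER_BYTES = 36   # sys.getsizeof([]) in the session
POINTER_BYTES = 4        # sizeof(PyObject *) on a 32-bit build


def list_bytes(allocated_slots):
    """Size of the list structure plus its array of element pointers."""
    return LIST_HEADER_BYTES + allocated_slots * POINTER_BYTES


def slots_after_first_append():
    """Capacity list_resize picks when asked for newsize = 1."""
    newsize = 1
    return newsize + (newsize >> 3) + (3 if newsize < 9 else 6)


print(list_bytes(0))                           # 36: empty list
print(list_bytes(1))                           # 40: list created as [1]
print(list_bytes(slots_after_first_append()))  # 52: empty list after one append
```

Note that this counts only the list object and its pointer array, not the elements the pointers refer to, which is also what sys.getsizeof reports.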

Indeed, everything makes sense 🙂
