How is tuple implemented in CPython?

As a caveat, everything in this answer is based on what I’ve gleaned from looking over the implementation you linked.

It seems that the standard implementation of a tuple is simply as an array. However, there are a bunch of optimizations in place to speed things up.

First, if you try to make an empty tuple, CPython instead will hand back a canonical object representing the empty tuple. As a result, it can save on a bunch of allocations that are just allocating a single object.

Next, to avoid allocating a bunch of small objects, CPython recycles memory for many small lists. There is a fixed constant (PyTuple_MAXSAVESIZE) such that all tuples less than this length are eligible to have their space reclaimed. Whenever an object of length less than this constant is deallocated, there is a chance that the memory associated with it will not be freed and instead will be stored in a “free list” (more on that in the next paragraph) based on its size. That way, if you ever need to allocate a tuple of size n and one has previously been allocated and is no longer in use, CPython can just recycle the old array.

The free list itself is implemented as an array of size PyTuple_MAXSAVESIZE storing pointers to unused tuples, where the nth element of the array points either to NULL (if no extra tuples of size n are available) or to a reclaimed tuple of size n. If there are multiple different tuples of size n that could be reused, they are chained together in a sort of linked list by having each tuple’s zeroth entry point to the next tuple that can be reused. (Since there is only one tuple of length zero ever allocated, there is never a risk of reading a nonexistent zeroth element). In this way, the allocator can store some number of tuples of each size for reuse. To ensure that this doesn’t use too much memory, there is a second constant PyTuple_MAXFREELIST that controls the maximum length of any of these linked lists within any bucket. There is then a secondary array of length PyTuple_MAXSAVESIZE that stores the length of the linked lists for tuples of each given length so that this upper limit isn’t exceeded.

All in all, it’s a very clever implementation!

Leave a Comment