How unique is Python’s id()?

Yes, CPython re-uses id() values. Do not count on these being unique in a Python program.

This is clearly documented:

Return the “identity” of an object. This is an integer which is guaranteed to be unique and constant for this object during its lifetime. Two objects with non-overlapping lifetimes may have the same id() value.

Bold emphasis mine. The id is unique only as long as an object is alive. Objects that have no references left to them are removed from memory, allowing the id() value to be re-used for another object, hence the non-overlapping lifetimes wording.

Note that this applies to CPython only, the standard implementation provided by python.org. There are other Python implementations, such as IronPython, Jython and PyPy, that make their own choices about how to implement id(), because they each can make distinct choices on how to handle memory and object lifetimes.

To address your specific questions:

  1. In CPython, id() is the memory address. New objects will be slotted into the next available memory space, so if a specific memory address has enough space to hold the next new object, the memory address will be reused. You can see this in the interpreter when creating new objects that are the same size:

    >>> id(1234)
    4546982768
    >>> id(4321)
    4546982768
    

    The 1234 literal creates a new integer object, for which id() produces a numeric value. As there are no further references to the int value, it is removed from memory again. But executing the same expression again with a different integer literal, and chances are you’ll see the same id() value (a garbage collection run breaking cyclic references could free up more memory, so you could also not see the same id() again.

    So it’s not random, but in CPython it is a function of the memory allocation algorithms.

  2. If you need to check specific objects, keep your own reference to it. That can be a weakref weak reference if all you need to assure is that the object is still ‘alive’.

    For example, recording an object reference first, then later checking it:

    import weakref
    
    # record
    object_ref = weakref.ref(some_object)
    
    # check if it's the same object still
    some_other_reference is object_ref()   # only true if they are the same object
    

    The weak reference won’t keep the object alive, but if it is alive then the object_ref() will return it (it’ll return None otherwise).

    You could use such a mechanism to generate really unique identifiers, see below.

  3. All you have to do to ‘destroy’ an object is to remove all references to it. Variables (local and global) are references. So are attributes on other objects, and entries in containers such as lists, tuples, dictionaries, sets, etc.

    The moment all references to an object are gone, the reference count on the object drops to 0 and it is deleted, there and then.

    Garbage collection only is needed to break cyclic references, objects that reference one another only, with no further references to the cycle. Because such a cycle will never reach a reference count of 0 without help, the garbage collector periodically checks for such cycles and breaks one of the references to help clear those objects from memory.

    So you can cause any object to be deleted from memory (freed), by removing all references to it. How you achieve that depends on how the object is referenced. You can ask the interpreter to tell you what objects are referencing a given object with the gc.get_referrers() function, but take into account that doesn’t give you variable names. It gives you objects, such as the dictionary object that is the __dict__ attribute of a module that references the object as a global, etc. For code fully under your control, at most use gc.get_referrers() as a tool to remind yourself what places the object is referenced from as you write the code to remove those.

If you must have unique identifiers for the lifetime of the Python application, you’d have to implement your own facility. If your objects are hashable and support weak references, then you could just use a WeakKeyDictionary instance to associate arbitrary objects with UUIDs:

from weakref import WeakKeyDictionary
from collections import defaultdict
from uuid import uuid4

class UniqueIdMap(WeakKeyDictionary):
    def __init__(self, dict=None):
        super().__init__(self)
        # replace data with a defaultdict to generate uuids
        self.data = defaultdict(uuid4)
        if dict is not None:
            self.update(dict)

uniqueidmap = UniqueIdMap()

def uniqueid(obj):
    """Produce a unique integer id for the object.

    Object must me *hashable*. Id is a UUID and should be unique
    across Python invocations.

    """
    return uniqueidmap[obj].int

This still produces integers, but as they are UUIDs they are not quite guaranteed to be unique, but the likelihood you’ll ever encounter the same ID during your lifetime are smaller than being hit by a meteorite. See How unique is UUID?

This then gives you unique ids even for objects with non-overlapping lifetimes:

>>> class Foo:
...     pass
...
>>> id(Foo())
4547149104
>>> id(Foo())  # memory address reused
4547149104
>>> uniqueid(Foo())
151797163173960170410969562162860139237
>>> uniqueid(Foo())  # but you still get a unique UUID
188632072566395632221804340107821543671

Leave a Comment