There are at least three ways:
Use dict.setdefault
>>> data = {}
>>> data.setdefault('foo', []).append(42)
>>> data
{'foo': [42]}
Use defaultdict
, which unlike .setdefault
, takes a callable:
>>> from collections import defaultdict
>>> data = defaultdict(list)
>>> data
defaultdict(<class 'list'>, {})
>>> data['foo'].append(42)
>>> data
defaultdict(<class 'list'>, {'foo': [42]})
Finally, subclass dict
and implement __missing__
:
>>> class MyDict(dict):
... def __missing__(self, key):
... self[key] = value = []
... return value
...
>>> data = MyDict()
>>> data['foo'].append(42)
>>> data
{'foo': [42]}
Note, you can think of the last one as the most flexible, you have access to the actual key that’s missing when you deal with it. defaultdict
is a class factory, and it generates a subclass of dict
as well. But, the callable is not passed any arguments, nevertheless, it is sufficient for most needs.
Further, note that the defaultdict
and __missing__
approaches will keep the default behavior, this may be undesirable after you create your data structure, you probably want a KeyError
usually, or at least, you don’t want mydict[key]
to add a key anymore.
In both cases, you can just create a regular dict from the dict subclasses, e.g. dict(data)
. This should generally be very fast, even for large dict
objects, especially if it is a one-time cost. For defaultdict
, you can also set the default_factory
to None
and the old behavior returns:
>>> data = defaultdict(list)
>>> data
defaultdict(<class 'list'>, {})
>>> data['foo']
[]
>>> data['bar']
[]
>>> data
defaultdict(<class 'list'>, {'foo': [], 'bar': []})
>>> data.default_factory = None
>>> data['baz']
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
KeyError: 'baz'
>>>