Python: Elegantly merge dictionaries with sum() of values [duplicate]

Doesn’t get simpler than this, I think:

a=[("13.5",100)]
b=[("14.5",100), ("15.5", 100)]
c=[("15.5",100), ("16.5", 100)]
input=[a,b,c]

from collections import Counter

print(sum(
    (Counter(dict(x)) for x in input),
    Counter()))

Note that Counter (also known as a multiset) is the most natural data structure for your data: a type of set to which elements can belong more than once, or equivalently, a map with semantics Element -> OccurrenceCount. You could have used it in the first place, instead of lists of tuples.
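As a rough sketch of what "using it in the first place" could look like (the variable names simply mirror the lists above, and the lookup of "12.0" is only there for illustration):

```python
from collections import Counter

# Assumption: each data set is kept as a Counter from the start,
# rather than as a list of (key, count) tuples.
a = Counter({"13.5": 100})
b = Counter({"14.5": 100, "15.5": 100})
c = Counter({"15.5": 100, "16.5": 100})

# Counter addition sums the counts of keys present in either operand.
total = a + b + c
print(total["15.5"])  # 200
print(total["12.0"])  # 0: a missing key reads as zero instead of raising KeyError
```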


Also possible:

from collections import Counter
from functools import reduce  # built in on Python 2, needs this import on Python 3
from operator import add

print(reduce(add, (Counter(dict(x)) for x in input)))

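Using reduce(add, seq) instead of sum(seq, initialValue) is generally more flexible and lets you skip passing the redundant initial value. The trade-off is that reduce raises a TypeError on an empty sequence unless you pass an initial value as its third argument.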
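A small sketch of how the two calls behave on empty input, in case that matters for your data (the names empty, total_sum, raised and total_reduce are made up for this example):

```python
from collections import Counter
from functools import reduce
from operator import add

empty = []  # hypothetical input with no lists at all

# sum() with an explicit start value handles an empty input.
total_sum = sum((Counter(dict(x)) for x in empty), Counter())

# reduce() without an initial value raises TypeError on an empty input...
try:
    reduce(add, (Counter(dict(x)) for x in empty))
    raised = False
except TypeError:
    raised = True

# ...unless an initial value is passed as the third argument.
total_reduce = reduce(add, (Counter(dict(x)) for x in empty), Counter())

print(total_sum, raised, total_reduce)
```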

Note that you could also use operator.and_ to find the intersection of the multisets instead of the sum.
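For illustration, here is a minimal sketch of the intersection next to the sum. The two Counters and their counts are invented for the example; for multisets, the intersection keeps the minimum count of each key that appears in every operand.

```python
from collections import Counter
from functools import reduce
from operator import add, and_

# Hypothetical data, chosen so that the counts for "15.5" differ.
b = Counter({"14.5": 100, "15.5": 100})
c = Counter({"15.5": 40, "16.5": 100})

combined = reduce(add, [b, c])   # counts are added per key
common = reduce(and_, [b, c])    # minimum count per key, shared keys only

print(combined["15.5"])  # 140
print(common)            # Counter({'15.5': 40})
```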


The reduce(add, ...) variant is terribly slow, because a new Counter is created on every step. Let’s fix that.

We know that Counter+Counter returns a new Counter with merged data. This is OK, but we want to avoid creating those extra objects. Let’s use Counter.update instead:

update(self, iterable=None, **kwds) unbound collections.Counter method

Like dict.update() but add counts instead of replacing them.
Source can be an iterable, a dictionary, or another Counter instance.
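To make the difference from dict.update() concrete, here is a short sketch (the names counts and plain are only for this example):

```python
from collections import Counter

counts = Counter({"15.5": 100})
result = counts.update({"15.5": 100, "16.5": 100})  # adds counts in place
print(counts["15.5"])  # 200
print(result)          # None: update() does not return the Counter

plain = {"15.5": 100}
plain.update({"15.5": 100, "16.5": 100})            # replaces existing values
print(plain["15.5"])   # 100
```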

That’s what we want. Let’s wrap it with a function compatible with reduce and see what happens.

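from functools import reduce  # built in on Python 2, needs this import on Python 3

def updateInPlace(a,b):
    a.update(b)  # adds b's counts into a; update() itself returns None
    return a

print(reduce(updateInPlace, (Counter(dict(x)) for x in input)))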

This is only marginally slower than the OP’s solution.

Benchmark: http://ideone.com/7IzSx (Updated with yet another solution, thanks to astynax)
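If the link is unavailable, something along these lines should let you compare the variants locally. This is only a sketch, not the linked benchmark: the synthetic data, the function names and the repetition count are all assumptions, and the timings will depend on your machine and Python version.

```python
import timeit
from collections import Counter
from functools import reduce
from operator import add

# Hypothetical synthetic input: 200 lists of 50 (key, count) pairs each.
data = [[(str(i + j), 1) for j in range(50)] for i in range(200)]

def with_sum():
    return sum((Counter(dict(x)) for x in data), Counter())

def with_reduce_add():
    return reduce(add, (Counter(dict(x)) for x in data))

def update_in_place(a, b):
    a.update(b)
    return a

def with_update():
    return reduce(update_in_place, (Counter(dict(x)) for x in data))

# Time each variant over the same input and print the total seconds.
for fn in (with_sum, with_reduce_add, with_update):
    seconds = timeit.timeit(fn, number=20)
    print("%-16s %.4f s" % (fn.__name__, seconds))
```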

(Also: if you desperately want a one-liner, you can replace updateInPlace with lambda x,y: x.update(y) or x, which works the same way and even proves to be a split second faster, but fails at readability. Don’t :-))
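For the curious, a sketch of why the lambda works, plus one caveat that applies to any in-place reduce. The list counters and the name merge are made up for this example, and passing Counter() as the initial value is my own addition, not part of the original one-liner.

```python
from collections import Counter
from functools import reduce

counters = [
    Counter({"13.5": 100}),
    Counter({"14.5": 100, "15.5": 100}),
    Counter({"15.5": 100, "16.5": 100}),
]

# Counter.update() returns None, so `x.update(y) or x` evaluates to x
# after x has been updated in place.
merge = lambda x, y: x.update(y) or x

# Passing a fresh Counter() as the initial value keeps the inputs untouched;
# without it, counters[0] itself would be mutated and returned.
total = reduce(merge, counters, Counter())
print(total["15.5"])  # 200
```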
