Python combinations without repetitions

I know this is late but I want to add a point.

set(itertools.combinations(t, 4)) would do a fine job for most cases, but it still iterates all repetitive combinations internally and so it can be computationally heavy. This is especially the case if there aren’t many actual unique combinations.

This one iterates only unique combinations:

from itertools import chain, repeat, count, islice
from collections import Counter


def repeat_chain(values, counts):
    return chain.from_iterable(map(repeat, values, counts))


def unique_combinations_from_value_counts(values, counts, r):
    n = len(counts)
    indices = list(islice(repeat_chain(count(), counts), r))
    if len(indices) < r:
        return
    while True:
        yield tuple(values[i] for i in indices)
        for i, j in zip(reversed(range(r)), repeat_chain(reversed(range(n)), reversed(counts))):
            if indices[i] != j:
                break
        else:
            return
        j = indices[i] + 1
        for i, j in zip(range(i, r), repeat_chain(count(j), counts[j:])):
            indices[i] = j


def unique_combinations(iterable, r):
    values, counts = zip(*Counter(iterable).items())
    return unique_combinations_from_value_counts(values, counts, r)

Usage:

>>> list(unique_combinations([2, 2, 2, 2, 4], 4)) # elements must be hashable
[(2, 2, 2, 2), (2, 2, 2, 4)]

# You can pass values and counts separately. For this usage, values don't need to be hashable
# Say you have ['a','b','b','c','c','c'], then since there is 1 of 'a', 2 of 'b', and 3 of 'c', you can do as follows:
>>> list(unique_combinations_from_value_counts(['a', 'b', 'c'], [1, 2, 3], 3))
[('a', 'b', 'b'), ('a', 'b', 'c'), ('a', 'c', 'c'), ('b', 'b', 'c'), ('b', 'c', 'c'), ('c', 'c', 'c')]

# unique_combinations() is a generator (and thus an iterator)
# so you can iterate it
>>> for comb in unique_combinations([2, 2, 2, 2, 4], 4):
...     print(sum(comb))
...
8   # 2+2+2+2
10  # 2+2+2+4

Note that itertools.combinations() is implemented in C, which means it is much faster than my python script for most cases. This code works better than set(itertools.combinations()) method only when there are A LOT MORE repetitive combinations than unique combinations.

Leave a Comment