Generating all Possible Combinations

Sure thing. It is a bit tricky to do this with LINQ but certainly possible using only the standard query operators.

UPDATE: This is the subject of my blog on Monday June 28th 2010; thanks for the great question. Also, a commenter on my blog noted that there is an even more elegant query than the one I gave. I’ll update the code here to use it.

The tricky part is to make the Cartesian product of arbitrarily many sequences. “Zipping” in the letters is trivial compared to that. You should study this to make sure that you understand how it works. Each part is simple enough but the way they are combined together takes some getting used to:

static IEnumerable<IEnumerable<T>> CartesianProduct<T>(this IEnumerable<IEnumerable<T>> sequences)
{
    IEnumerable<IEnumerable<T>> emptyProduct = new[] { Enumerable.Empty<T>()};
    return sequences.Aggregate(
        emptyProduct,
        (accumulator, sequence) => 
            from accseq in accumulator 
            from item in sequence 
            select accseq.Concat(new[] {item})                          
        );
 }

To explain how this works, first understand what the “accumulate” operation is doing. The simplest accumulate operation is “add everything in this sequence together”. The way you do that is: start with zero. For each item in the sequence, the current value of the accumulator is equal to the sum of the item and previous value of the accumulator. We’re doing the same thing, except that instead of accumulating the sum based on the sum so far and the current item, we’re accumulating the Cartesian product as we go.

The way we’re going to do that is to take advantage of the fact that we already have an operator in LINQ that computes the Cartesian product of two things:

from x in xs
from y in ys
do something with each possible (x, y)

By repeatedly taking the Cartesian product of the accumulator with the next item in the input sequence and doing a little pasting together of the results, we can generate the Cartesian product as we go.

So think about the value of the accumulator. For illustrative purposes I’m going to show the value of the accumulator as the results of the sequence operators it contains. That is not what the accumulator actually contains. What the accumulator actually contains is the operators that produce these results. The whole operation here just builds up a massive tree of sequence operators, the result of which is the Cartesian product. But the final Cartesian product itself is not actually computed until the query is executed. For illustrative purposes I’ll show what the results are at each stage of the way but remember, this actually contains the operators that produce those results.

Suppose we are taking the Cartesian product of the sequence of sequences {{1, 2}, {3, 4}, {5, 6}}. The accumulator starts off as a sequence containing one empty sequence: { { } }

On the first accumulation, accumulator is { { } } and item is {1, 2}. We do this:

from accseq in accumulator
from item in sequence 
select accseq.Concat(new[] {item})

So we are taking the Cartesian product of { { } } with {1, 2}, and for each pair, we concatenate: We have the pair ({ }, 1), so we concatenate { } and {1} to get {1}. We have the pair ({ }, 2}), so we concatenate { } and {2} to get {2}. Therefore we have {{1}, {2}} as the result.

So on the second accumulation, accumulator is {{1}, {2}} and item is {3, 4}. Again, we compute the Cartesian product of these two sequences to get:

 {({1}, 3), ({1}, 4), ({2}, 3), ({2}, 4)}

and then from those items, concatenate the second one onto the first. So the result is the sequence {{1, 3}, {1, 4}, {2, 3}, {2, 4}}, which is what we want.

Now we accumulate again. We take the Cartesian product of the accumulator with {5, 6} to get

 {({ 1, 3}, 5), ({1, 3}, 6), ({1, 4}, 5), ...

and then concatenate the second item onto the first to get:

{{1, 3, 5}, {1, 3, 6}, {1, 4, 5}, {1, 4, 6} ... }

and we’re done. We’ve accumulated the Cartesian product.

Now that we have a utility function that can take the Cartesian product of arbitrarily many sequences, the rest is easy by comparison:

var arr1 = new[] {"a", "b", "c"};
var arr2 = new[] { 3, 2, 4 };
var result = from cpLine in CartesianProduct(
                 from count in arr2 select Enumerable.Range(1, count)) 
             select cpLine.Zip(arr1, (x1, x2) => x2 + x1);

And now we have a sequence of sequences of strings, one sequence of strings per line:

foreach (var line in result)
{
    foreach (var s in line)
        Console.Write(s);
    Console.WriteLine();
}

Easy peasy!

Leave a Comment