Merge lists that share common elements

You can see your list as a notation for a Graph, ie ['a','b','c'] is a graph with 3 nodes connected to each other. The problem you are trying to solve is finding connected components in this graph.

You can use NetworkX for this, which has the advantage that it’s pretty much guaranteed to be correct:

l = [['a','b','c'],['b','d','e'],['k'],['o','p'],['e','f'],['p','a'],['d','g']]

import networkx 
from networkx.algorithms.components.connected import connected_components


def to_graph(l):
    G = networkx.Graph()
    for part in l:
        # each sublist is a bunch of nodes
        G.add_nodes_from(part)
        # it also imlies a number of edges:
        G.add_edges_from(to_edges(part))
    return G

def to_edges(l):
    """ 
        treat `l` as a Graph and returns it's edges 
        to_edges(['a','b','c','d']) -> [(a,b), (b,c),(c,d)]
    """
    it = iter(l)
    last = next(it)

    for current in it:
        yield last, current
        last = current    

G = to_graph(l)
print connected_components(G)
# prints [['a', 'c', 'b', 'e', 'd', 'g', 'f', 'o', 'p'], ['k']]

To solve this efficiently yourself you have to convert the list into something graph-ish anyways, so you might as well use networkX from the start.

Leave a Comment