Given parallel lists, how can I sort one while permuting (rearranging) the other in the same way?

One classic approach to this problem is to use the “decorate, sort, undecorate” idiom, which is especially simple using python’s built-in zip function:

>>> list1 = [3,2,4,1, 1]
>>> list2 = ['three', 'two', 'four', 'one', 'one2']
>>> list1, list2 = zip(*sorted(zip(list1, list2)))
>>> list1
(1, 1, 2, 3, 4)
>>> list2 
('one', 'one2', 'two', 'three', 'four')

These of course are no longer lists, but that’s easily remedied, if it matters:

>>> list1, list2 = (list(t) for t in zip(*sorted(zip(list1, list2))))
>>> list1
[1, 1, 2, 3, 4]
>>> list2
['one', 'one2', 'two', 'three', 'four']

It’s worth noting that the above may sacrifice speed for terseness; the in-place version, which takes up 3 lines, is a tad faster on my machine for small lists:

>>> %timeit zip(*sorted(zip(list1, list2)))
100000 loops, best of 3: 3.3 us per loop
>>> %timeit tups = zip(list1, list2); tups.sort(); zip(*tups)
100000 loops, best of 3: 2.84 us per loop

On the other hand, for larger lists, the one-line version could be faster:

>>> %timeit zip(*sorted(zip(list1, list2)))
100 loops, best of 3: 8.09 ms per loop
>>> %timeit tups = zip(list1, list2); tups.sort(); zip(*tups)
100 loops, best of 3: 8.51 ms per loop

As Quantum7 points out, JSF’s suggestion is a bit faster still, but it will probably only ever be a little bit faster, because Python uses the very same DSU idiom internally for all key-based sorts. It’s just happening a little closer to the bare metal. (This shows just how well optimized the zip routines are!)

I think the zip-based approach is more flexible and is a little more readable, so I prefer it.


Note that when elements of list1 are equal, this approach will end up comparing elements of list2. If elements of list2 don’t support comparison, or don’t produce a boolean when compared (for example, if list2 is a list of NumPy arrays), this will fail, and if elements of list2 are very expensive to compare, it might be better to avoid comparison anyway.

In that case, you can sort indices as suggested in jfs’s answer, or you can give the sort a key function that avoids comparing elements of list2:

result1, result2 = zip(*sorted(zip(list1, list2), key=lambda x: x[0]))

Also, the use of zip(*...) as a transpose fails when the input is empty. If your inputs might be empty, you will have to handle that case separately.

Leave a Comment