Can an array be grouped more efficiently than sorted?

Since you asked about comparison-based methods, I’m going to make the usual assumptions that (1) elements can be compared but not hashed (2) the only resource of interest is three-way operations.

In an absolute sense, it’s easier to group than to sort. Here’s a grouping algorithm for three elements that uses one comparison (sorting requires three). Given an input x, y, z, if x = y, then return x, y, z. Otherwise, return x, z, y.

Asymptotically, however, both grouping and sorting require Omega(n log n) comparisons. The lower bound technique is information-theoretic: we prove that, for every grouping algorithm expressed as a decision tree, there are 3^Omega(n log n) leaves, which implies that the height of the tree (and hence the worst-case running time of the algorithm) is Omega(n log n).

Fix an arbitrary leaf of the decision tree where no input elements are found to be equal. The input positions are partially ordered by the inequalities found.

Suppose to the contrary that i, j, k are pairwise incomparable input positions. Letting x = input[i], y = input[j], z = input[k], the possibilities x = y < z and y = z < x and z = x < y are all consistent with what the algorithm has observed. This cannot be, since it is impossible for the one order chosen by the leaf to put x next to y next to z next to x. We conclude that the partial order has no antichain of cardinality three.

By Dilworth’s theorem, the partial order has two chains that cover the whole input. By considering all possible ways to merge these chains into a total order, there are at most n choose m ≤ 2^n permutations that map to each leaf. The number of leaves is thus at least n!/2^n = 3^Omega(n log n).

Leave a Comment