What is the most efficient algorithm to find a straight line that goes through most points?

There is likely no solution to this problem that is significantly better than O(n^2) in a standard model of computation.

The problem of finding three collinear points reduces to the problem of finding the line that goes through the most points, and finding three collinear points is 3SUM-hard, meaning that solving it in less than O(n^2) time would be a major theoretical result.

See the previous question on finding three collinear points.

For your reference (using the known proof), suppose we want to answer a 3SUM problem such as finding x, y, z in list X such that x + y + z = 0. If we had a fast algorithm for the collinear point problem, we could use that algorithm to solve the 3SUM problem as follows.

For each x in X, create the point (x, x^3) (for now we assume the elements of X are distinct). Next, check whether there exists three collinear points from among the created points.

To see that this works, note that if x + y + z = 0 then the slope of the line from x to y is

(y^3 – x^3) / (y – x) = y^2 + yx + x^2

and the slope of the line from x to z is

(z^3 – x^3) / (z – x) = z^2 + zx + x^2 = (-(x + y))^2 – (x + y)x + x^2
= x^2 + 2xy + y^2 – x^2 – xy + x^2 = y^2 + yx + x^2

Conversely, if the slope from x to y equals the slope from x to z then

y^2 + yx + x^2 = z^2 + zx + x^2,

which implies that

(y – z) (x + y + z) = 0,

so either y = z or z = -x – y as suffices to prove that the reduction is valid.

If there are duplicates in X, you first check whether x + 2y = 0 for any x and duplicate element y (in linear time using hashing or O(n lg n) time using sorting), and then remove the duplicates before reducing to the collinear point-finding problem.

Leave a Comment