You can detect it by simply running two pointers through the list, this process is known as the tortoise and hare algorithm after the fable of the same name:
- First off, check if the list is empty (
head
isnull
). If so, no cycle exists, so stop now. - Otherwise, start the first pointer
tortoise
on the first nodehead
, and the second pointerhare
on the second nodehead.next
. - Then loop continuously until
hare
isnull
(which may be already true in a one-element list), advancingtortoise
by one andhare
by two in each iteration. The hare is guaranteed to reach the end first (if there is an end) since it started ahead and runs faster. - If there is no end (i.e., if there is a cycle), they will eventually point to the same node and you can stop, knowing you have found a node somewhere within the cycle.
Consider the following loop which starts at 3
:
head -> 1 -> 2 -> 3 -> 4 -> 5
^ |
| V
8 <- 7 <- 6
Starting tortoise
at 1 and hare
at 2, they take on the following values:
(tortoise,hare) = (1,2) (2,4) (3,6) (4,8) (5,4) (6,6)
Because they become equal at (6,6)
, and since hare
should always be beyond tortoise
in a non-looping list, it means you’ve discovered a cycle.
The pseudo-code will go something like this:
def hasLoop (head):
return false if head = null # Empty list has no loop.
tortoise = head # tortoise initially first element.
hare = tortoise.next # Set hare to second element.
while hare != null: # Go until hare reaches end.
return false if hare.next = null # Check enough left for hare move.
hare = hare.next.next # Move hare forward two.
tortoise = tortoise.next # Move tortoise forward one.
return true if hare = tortoise # Same means loop found.
endwhile
return false # Loop exit means no loop.
enddef
The time complexity for this algorithm is O(n)
since the number of nodes visited (by tortoise and hare) is proportional to the number of nodes.
Once you know a node within the loop, there’s also an O(n)
guaranteed method to find the start of the loop.
Let’s return to the original position after you’ve found an element somewhere in the loop but you’re not sure where the start of the loop is.
head -> 1 -> 2 -> 3 -> 4 -> 5
^ |
| V
8 <- 7 <- 6
\
x (where hare and tortoise met).
This is the process to follow:
- Advance
hare
and setsize
to1
. - Then, as long as
hare
andtortoise
are different, continue to advancehare
, increasingsize
each time. This eventually gives the size of the cycle, six in this case. - At this point, if
size
is1
, that means you must already be at the start of the cycle (in a cycle of size one, there is only one possible node that can be in the cycle so it must be the first one). In this case, you simply returnhare
as the start, and skip the rest of the steps below. - Otherwise, set both
hare
andtortoise
to the first element of the list and advancehare
exactlysize
times (to the7
in this case). This gives two pointers that are different by exactly the size of the cycle. - Then, as long as
hare
andtortoise
are different, advance them both together (with the hare running at a more sedate pace, the same speed as the tortoise – I guess it’s tired from its first run). Since they will remain exactlysize
elements apart from each other at all times,tortoise
will reach the start of the cycle at exactly the same time ashare
returns to the start of the cycle.
You can see that with the following walkthrough:
size tortoise hare comment
---- -------- ---- -------
6 1 1 initial state
7 advance hare by six
2 8 1/7 different, so advance both together
3 3 2/8 different, so advance both together
3/3 same, so exit loop
Hence 3
is the start point of the cycle and, since both those operations (the cycle detection and cycle start discovery) are O(n)
and performed sequentially, the whole thing taken together is also O(n)
.
If you want a more formal proof that this works, you can examine the following resources:
- a question on our sister site;
- the Wikipedia cycle detection page; or
- “The Tortoise and the Hare Algorithm” by Peter Gammie, April 17, 2016.
If you’re simply after support for the method (not formal proof), you can run the following Python 3 program which evaluates its workability for a large number of sizes (how many elements in the cycle) and lead-ins (elements before the cycle start).
You’ll find it always finds a point where the two pointers meet:
def nextp(p, ld, sz):
if p == ld + sz:
return ld
return p + 1
for size in range(1,1001):
for lead in range(1001):
p1 = 0
p2 = 0
while True:
p1 = nextp(p1, lead, size)
p2 = nextp(nextp(p2, lead, size), lead, size)
if p1 == p2:
print("sz = %d, ld = %d, found = %d" % (size, lead, p1))
break