Store the largest 5000 numbers from a stream of numbers

The simplest solution for this is maintaining a min heap of max size 5000.

  • Every time a new number arrives – check if the heap is smaller then
    5000, if it is – add it.
  • If it is not – check if the minimum is smaller then the new
    element, and if it is, pop it out and insert the new element instead.
  • When you are done – you have a heap containing 5000 largest elements.

This solution is O(nlogk) complexity, where n is the number of elements and k is the number of elements you need (5000 in your case).

It can be done also in O(n) using selection algorithm – store all the elements, and then find the 5001th largest element, and return everything bigger than it. But it is harder to implement and for reasonable size input – might not be better. Also, if stream contains duplicates, more processing is needed.

Leave a Comment