Generating m distinct random numbers in the range [0..n-1]

Pure mathematics:
Let’s count the expected number of rand() calls in both cases and compare the results:

Case 1:
Let’s look at the expected number of calls at step i = k, when k numbers have already been chosen. A single rand() call returns a number we don’t have yet with probability p = (n - k)/n. We need the expected number of calls it takes until such a number comes up.

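The probability of getting it on the first call is p. On the second call it is q * p, where q = 1 - p. In general, the probability of getting it on exactly the j-th call is q^(j-1) * p (j is used for the call count here because n already denotes the size of the range). Thus, the expected number of calls is
Sum[ j * q^(j-1) * p ], j = 1 --> INF. This is the mean of a geometric distribution, and the sum is equal to 1/p (confirmed with Wolfram Alpha).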
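
As a quick numeric sanity check of that sum, here is a minimal sketch that adds up the first terms of the series. The function name expected_calls_series and the terms parameter are made up for this illustration and are not part of the original answer.

```cpp
#include <cassert>
#include <cstddef>

// Partial sum of Sum[ j * q^(j-1) * p ], j = 1 .. terms, with q = 1 - p.
// For 0 < p <= 1 the partial sums approach 1/p as terms grows.
double expected_calls_series(double p, std::size_t terms) {
    assert(p > 0.0 && p <= 1.0);
    const double q = 1.0 - p;
    double q_pow = 1.0;  // holds q^(j-1) for the current j
    double sum = 0.0;
    for (std::size_t j = 1; j <= terms; ++j) {
        sum += static_cast<double>(j) * q_pow * p;
        q_pow *= q;
    }
    return sum;
}
```

With p = 0.25 and a thousand terms the result should be very close to 1/p = 4.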

So, at step i = k you will make 1/p = n/(n - k) calls to rand() on average.
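
For reference, here is a rough sketch of the kind of retry loop this analysis describes. It assumes method 1 means "draw again until the value is new", and it uses std::mt19937 with std::uniform_int_distribution in place of rand(). The name sample_with_retry and the calls counter are hypothetical helpers for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <vector>

// Method 1 as assumed here: keep drawing uniformly from [0, n-1] and
// discard any value that was already chosen. calls counts every draw.
std::vector<std::size_t> sample_with_retry(std::size_t n, std::size_t m,
                                           std::mt19937& gen,
                                           std::size_t& calls) {
    assert(n > 0 && m <= n);
    std::uniform_int_distribution<std::size_t> dist(0, n - 1);
    std::vector<bool> taken(n, false);
    std::vector<std::size_t> result;
    result.reserve(m);
    calls = 0;
    while (result.size() < m) {
        const std::size_t x = dist(gen);
        ++calls;
        if (!taken[x]) {  // happens with probability (n - k)/n at step k
            taken[x] = true;
            result.push_back(x);
        }
    }
    return result;
}
```

The taken vector costs O(n) memory. When m is much smaller than n, a hash set of the chosen values may be the better choice.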

Now let’s sum this over all m steps:

Sum[ n/(n - k) ], k = 0 --> m - 1, which is equal to n * T. This is the expected number of rand() calls in method 1.
Here T = Sum[ 1/(n - k) ], k = 0 --> m - 1.
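
The formula translates almost directly into code. This is only a sketch, and expected_calls_retry is a name chosen for this illustration.

```cpp
#include <cassert>
#include <cstddef>

// Expected number of rand() calls for method 1, i.e. n * T
// with T = Sum[ 1/(n - k) ], k = 0 .. m-1.
double expected_calls_retry(std::size_t n, std::size_t m) {
    assert(m <= n);
    double t = 0.0;
    for (std::size_t k = 0; k < m; ++k) {
        t += 1.0 / static_cast<double>(n - k);
    }
    return static_cast<double>(n) * t;
}
```

For example, n = 2 and m = 2 gives 2 * (1/2 + 1/1) = 3 expected calls.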

Case 2:

Here rand() is called inside random_shuffle n - 1 times (in most implementations).
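
To see where the n - 1 comes from, here is a sketch of a hand-written Fisher-Yates shuffle with a call counter. It stands in for random_shuffle, which was deprecated in C++14 and removed in C++17 (std::shuffle is the replacement), and a standard library implementation is not required to do exactly this. The name sample_with_shuffle and the calls counter are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <numeric>
#include <random>
#include <utility>
#include <vector>

// Method 2 as assumed here: shuffle 0..n-1 with a Fisher-Yates pass
// and keep the first m values. calls counts the random draws.
std::vector<std::size_t> sample_with_shuffle(std::size_t n, std::size_t m,
                                             std::mt19937& gen,
                                             std::size_t& calls) {
    assert(m <= n);
    std::vector<std::size_t> v(n);
    std::iota(v.begin(), v.end(), std::size_t{0});
    calls = 0;
    // One draw per position from the last down to the second: n - 1 draws.
    for (std::size_t i = n; i > 1; --i) {
        std::uniform_int_distribution<std::size_t> dist(0, i - 1);
        std::swap(v[i - 1], v[dist(gen)]);
        ++calls;
    }
    v.resize(m);
    return v;
}
```

The number of draws depends only on n, not on m, which is why this case contributes a fixed n - 1 to the comparison.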

Now, to choose the method, we have to compare these two values: n * T against n - 1.
Dividing both sides by n gives the rule: calculate T as described above. If T < (n - 1)/n, the first method needs fewer rand() calls on average, so use it. Otherwise use the second method.
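
The decision rule can be sketched as a small helper. The name retry_is_cheaper is made up for this illustration, and the function compares expected rand() calls only, not memory use or other costs.

```cpp
#include <cassert>
#include <cstddef>

// Returns true when method 1 (retry) is expected to need fewer rand()
// calls than method 2 (shuffle), i.e. when T < (n - 1)/n.
bool retry_is_cheaper(std::size_t n, std::size_t m) {
    assert(n > 0 && m <= n);
    double t = 0.0;  // T = Sum[ 1/(n - k) ], k = 0 .. m-1
    for (std::size_t k = 0; k < m; ++k) {
        t += 1.0 / static_cast<double>(n - k);
    }
    return t < static_cast<double>(n - 1) / static_cast<double>(n);
}
```

For large n, T is roughly ln(n/(n - m)), so the crossover sits at about m = 0.63 * n: below that the first method tends to win, above it the second.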
