Generate N random numbers within a range with a constant sum

If you want the sample to follow a uniform distribution, the problem reduces to generating N random numbers whose sum is 1. This, in turn, is a special case of the Dirichlet distribution, but it can also be computed more easily using the Exponential distribution. Here is how:

1. Take a uniform sample v₁ … v_N with all v_i between 0 and 1.
2. For all i, 1<=i<=N, define u_i := -ln v_i (notice that u_i > 0).
3. Normalize the u_i as p_i := u_i/s where s is the sum u₁+…+u_N.

The p₁ … p_N are uniformly distributed on the simplex of dimension N-1, and their sum is 1.
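The three steps above can be sketched in Python as follows. This is only a minimal sketch, not a reference implementation: the function name `simplex_sample` is made up for illustration, and it draws each v_i from (0, 1] by using `1 - random()` so that the logarithm is always defined, which is an assumption of this sketch rather than part of the recipe.

```python
import math
import random


def simplex_sample(n, rng=random):
    """Return n non-negative numbers that sum to 1 (hypothetical helper)."""
    # Assumption: rng.random() lies in [0, 1), so 1 - rng.random() lies in
    # (0, 1] and math.log never receives 0.
    u = [-math.log(1.0 - rng.random()) for _ in range(n)]
    s = sum(u)  # zero only if every draw is exactly 0.0, which is negligible
    return [x / s for x in u]


p = simplex_sample(5)
print(p, sum(p))
```

Each u_i here is an Exponential(1) variate, which is why the answer mentions the Exponential distribution.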

You can now multiply these p_i by the constant C you want and translate them by adding some other constant A, like this:

q_i := A + p_i*C.

EDIT 3

In order to address some issues raised in the comments, let me add the following:

To ensure that the final random sequence falls in the interval [a,b], choose the constants A and C above as A := a and C := b-a, i.e., take q_i = a + p_i*(b-a). Since p_i is in the range (0,1), all q_i will be in the range [a,b].
One cannot take the negative logarithm -ln(v_i) if v_i happens to be 0, because ln() is not defined at 0. The probability of such an event is extremely low. However, to ensure that no error is raised, the generation of v₁ … v_N in the first step above must treat any occurrence of 0 in a special way: consider -ln(0) to be +infinity (remember: ln(x) -> -infinity when x -> 0). Then the sum s = +infinity, which means that the corresponding p_i = 1 and all other p_j = 0. Without this convention the sequence (0…1…0) would never be generated (many thanks to @Severin Pappadeux for this interesting remark).
As explained in the 4th comment attached to the question by @Neil Slater, it is logically impossible to fulfill all the requirements of the original framing. Therefore any solution must relax the constraints to a proper subset of the original ones. Other comments by @Behrooz seem to confirm that this would suffice in this case.
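To make the first two points concrete, here is a rough Python sketch of the rescaling to [a,b] together with the -ln(0) = +infinity convention. The names `neg_log`, `normalize` and `sample_in_range` are hypothetical, and splitting the mass equally when more than one v_i is exactly 0 is my own assumption (the text only discusses a single zero).

```python
import math
import random


def neg_log(v):
    # Convention from the text: treat -ln(0) as +infinity.
    return math.inf if v == 0.0 else -math.log(v)


def normalize(u):
    """Divide by the sum; an infinite entry becomes 1 and the others 0."""
    infinite = [math.isinf(x) for x in u]
    if any(infinite):
        k = sum(infinite)  # almost surely 1; equal split is an assumption
        return [1.0 / k if flag else 0.0 for flag in infinite]
    s = sum(u)
    return [x / s for x in u]


def sample_in_range(n, a, b, rng=random):
    """Return n numbers in [a, b] whose sum is n*a + (b - a)."""
    v = [rng.random() for _ in range(n)]  # each v_i in [0, 1)
    p = normalize([neg_log(x) for x in v])
    return [a + x * (b - a) for x in p]


q = sample_in_range(4, 10.0, 20.0)
print(q, sum(q))  # the sum equals 4*10 + (20 - 10) = 50 up to rounding
```

Note that with A := a and C := b-a the constant sum is fixed at N*a + (b-a), so you cannot choose the range and the sum independently with this construction.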

EDIT 2

One more issue has been raised in the comments:

Why does rescaling a uniform sample not suffice?

In other words, why should I bother to take negative logarithms?

The reason is that if we just rescale, the resulting sample won’t be uniformly distributed across the segment (0,1) (or [a,b] for the final sample).

To visualize this, let’s think in 2D, i.e., let’s consider the case N=2. A uniform sample (v₁,v₂) corresponds to a random point in the square with origin (0,0) and corner (1,1). Now, when we normalize such a point by dividing it by the sum s=v₁+v₂, what we are doing is projecting the point, along the line through the origin, onto the diagonal as shown in the picture (keep in mind that this diagonal is the line x + y = 1):

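[Figure: points of the unit square projected onto the line x + y = 1 (in blue); green lines lie close to the principal diagonal, orange lines close to the axes]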

But given that the green lines, which are closer to the principal diagonal from (0,0) to (1,1), are longer than the orange ones, which are closer to the x and y axes, the projections tend to accumulate more around the center of the projection line (in blue), where the scaled sample lives. This shows that a simple scaling won’t produce a uniform sample on the depicted diagonal. On the other hand, it can be proven mathematically that the negative logarithms do produce the desired uniformity. So, instead of copy-pasting a mathematical proof, I would invite everyone to implement both algorithms and check that the resulting plots behave as this answer describes.
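For anyone who wants to try that comparison without plotting, here is a small Python sketch for N=2 that prints a 10-bin histogram of the first coordinate p₁ for both approaches. The function names, the seed, the number of trials and the bin count are all arbitrary choices of mine, and both samplers draw from (0, 1] via `1 - random()` to avoid dividing by zero or taking ln(0).

```python
import math
import random


def rescaled_pair(rng):
    """Naive approach: divide a uniform pair by its sum, return p_1."""
    v1 = 1.0 - rng.random()  # in (0, 1]
    v2 = 1.0 - rng.random()
    return v1 / (v1 + v2)


def neg_log_pair(rng):
    """Approach from this answer: normalize negative logarithms, return p_1."""
    u1 = -math.log(1.0 - rng.random())
    u2 = -math.log(1.0 - rng.random())
    return u1 / (u1 + u2)


def histogram(sampler, trials=100_000, bins=10, seed=0):
    """Fraction of samples of p_1 landing in each of `bins` equal-width bins."""
    rng = random.Random(seed)
    counts = [0] * bins
    for _ in range(trials):
        x = sampler(rng)
        counts[min(int(x * bins), bins - 1)] += 1
    return [c / trials for c in counts]


print("rescaled:", histogram(rescaled_pair))
print("neg-log: ", histogram(neg_log_pair))
```

If the argument above is right, the rescaled histogram should be visibly peaked around the middle bins, while the negative-logarithm histogram should stay close to 0.1 in every bin.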

(Note: here is a blog post on this interesting subject with an application to the Oil & Gas industry)
