World's most popular travel blog for travel bloggers.

Uniform sampling from a simplex

, , No Comments
Problem Detail: 

I am looking for an algorithm to generate an array of N random numbers, such that the sum of the N numbers is 1, and all numbers lie within 0 and 1. For example, N=3, the random point (x, y, z) should lie within the triangle:

x + y + z = 1 0 < x < 1 0 < y < 1 0 < z < 1 

Ideally I want each point within the area to have equal probability. If it's too hard, I can drop the requirement. Thanks.

Asked By : Ruofeng

Answered By : A.Schulz

Let us first assume that you want to sample within

x + y + z = 1 0 ≤ x ≤ 1 0 ≤ y ≤ 1 0 ≤ z ≤ 1 

This doesn't make quite a difference, since the sample point will still lie in your requested area with high probability.

Now you are left with sampling a point from a simplex. In the 3d example you get a 2d simplex (triangle) realized in 3d.

How to pick a point uniformly at random was discussed in this blog post (see the comments).

For your problem it would mean that you take $n-1$ random numbers from the interval $(0,1)$, then you add a $0$ and $1$ to get a list of $n+1$ numbers. You sort the list and then you record the differences between two consecutive elements. This gives you a list of $n$ number that will sum up to $1$. Moreover this sampling is uniform. This idea can be found in Donald B. Rubin, The Bayesian bootstrap Ann. Statist. 9, 1981, 130-134.

For example ($n=4$) you have the three random numbers 0.4 0.2 0.1 then you obtain the sorted sequence 0 0.1 0.2 0.4 1 and this gives the differences 0.1 0.1 0.2 0.6, and by construction these four numbers sum up to 1.

Another approach is the following: first sample from the hypercube (that is you forget about x+y+z=1) and then normalize the sample point. The normalization is a projection from the $d$-hypercube to the $d-1$-simplex. It should be intuitively clear that the points at the center of the simplex have more "pre-image-points" than at the outside. Hence, if you sample uniformly from the hypercube, this wont give you a uniform sampling in the simplex. However, if you sample from the hypercube with an appropriate Exponential Distribution, than this effect cancels out. The Figure gives you an idea how both methods will sample. However, I prefer the "sorting" method due to its simple form. It's also easier to implement.

Example of the 2 sampling methods

Best Answer from StackOverflow

Question Source : http://cs.stackexchange.com/questions/3227

0 comments:

Post a Comment

Let us know your responses and feedback