Class 0x0A: Monte Carlo integration and simulation

Simple Monte Carlo integration

Suppose we have some function f({x}) of the n-dimensional parameter {x}. Pick N random points {x}_i, uniformly distributed in volume V. Then

I_\text{MC} \equiv \frac{V}{N} \sum_{i=1}^N f({x}_i) = V \langle f \rangle

is itself a random number, with mean and standard deviation

\overline{I_\text{MC}} = \int f \, dV, \qquad \sigma_{I_\text{MC}} = \sqrt{\overline{I_\text{MC}^2}-\overline{I_\text{MC}}^2} \approx \frac{V}{\sqrt{N}} \sqrt{\langle f^2\rangle - \langle f\rangle^2}.

(Here angle brackets are used to denote the mean over the samples, and overline is used to denote the true mean in the limit of infinite statistics.)

In plain English, I_\text{MC} is an estimate for the integral, and its statistical uncertainty shrinks as 1/\sqrt{N}.

Integration over many dimensions

The key advantage of Monte Carlo integration: the 1/\sqrt{N} behaviour of the error holds in any number of dimensions n, whereas the number of points needed by a grid-based rule grows exponentially with n.

Integration over complicated volumes

If the volume V is hard to sample randomly (strange shape), just define a function that is zero outside the volume and equal to f inside, and integrate that over an easy-to-sample volume.

Figure: 3-d view of a complicated solid resembling a spiral, lumpy worm, labelled with its defining expression ((x-cos(z)/cosh(z))/0.25)^2+((y-sin(z)/cosh(z))/(0.2+0.05*sin(4*z)))^2-1+(z/6)^2

A complicated solid with fine-grained features that would be difficult to integrate analytically or on a numerical grid.

How to get uniform random numbers

Short answer: use a function that returns a (pseudo) random number uniform in the range [0,1), available from many libraries.

It used to be considered necessary to discuss the pros and cons of various random number generators (RNGs) at length, because bad RNGs were commonplace, good ones were hard to find, and computers were slow enough that a fast RNG with poor randomness could be worth considering.

The latter two factors are generally no longer true, so a little advice should suffice for the moment.

See also: the GSL manual section "General comments on random numbers".

Example: Mass of an ellipsoidal cloud

The interior of an ellipsoid is defined by (x/a)^2 + (y/b)^2 + (z/c)^2 < 1. Let the density function be \rho(x,y,z).

Algorithm:
    initialize sum_rho1 = 0, sum_rho2 = 0
    loop N times:
        pick uniform x in -a<x<a, uniform y in -b<y<b, uniform z in -c<z<c
        if (x/a)^2 + (y/b)^2 + (z/c)^2 < 1 :
            rho = \rho(x,y,z)
            sum_rho1 += rho
            sum_rho2 += rho * rho
    V = 8*a*b*c
    mean = sum_rho1/N
    mean2 = sum_rho2/N
    integral = V*mean
    errorest = V*sqrt((mean2 - mean*mean)/N)

Weighting techniques (change of variable)

If the integrand contains a factor that varies rapidly but can be integrated analytically, change variables so that uniform sampling in the new variable distributes the points according to that factor. Only the remaining, smoother part of the integrand then needs to be averaged, which reduces the variance of the estimate.

Example: Mass of an ellipsoidal cloud (revisited)

The interior of an ellipsoid is defined by (x/a)^2 + (y/b)^2 + (z/c)^2 < 1. Let the density function be \rho(x,y,z)=f(x,y) e^{\alpha z}.
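
To see why sampling s uniformly does the right thing, substitute s = e^{\alpha z}/\alpha, so that ds = e^{\alpha z}\,dz:

```latex
\int_{-c}^{c} f(x,y)\, e^{\alpha z}\, dz
  = \int_{s_1}^{s_2} f(x,y)\, ds,
\qquad s_1 = \frac{e^{-\alpha c}}{\alpha},
\quad  s_2 = \frac{e^{\alpha c}}{\alpha}.
```

Points uniform in s are therefore distributed in z with density proportional to e^{\alpha z}: the rapidly varying factor is absorbed exactly, and only the smoother f(x,y) remains to be averaged.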

Algorithm:
    initialize sum_1 = 0, sum_2 = 0
    s_1 = e^{-\alpha c}/\alpha, s_2 = e^{\alpha c}/\alpha
    loop N times:
        pick uniform x in -a<x<a, uniform y in -b<y<b, uniform s in s_1<s<s_2
        z = \log(\alpha s)/\alpha
        if (x/a)^2 + (y/b)^2 + (z/c)^2 < 1 :
            f = f(x,y)
            sum_1 += f
            sum_2 += f * f
    V = 4*a*b*(s_2-s_1)
    mean = sum_1/N
    mean2 = sum_2/N
    integral = V*mean
    errorest = V*sqrt((mean2 - mean*mean)/N)

Simulation of "unknown" probability distributions

Sometimes you know everything about a random process, except it's too hard to evaluate the resulting probability distribution analytically.

Example: a game of snakes and ladders: you know all the rules, but what is the probability distribution of the number of moves in a game?

One can simulate the process using random numbers and keep track of the distributions of the quantities of interest.

This turns out to be mathematically equivalent to evaluating probability integrals using the Monte Carlo method, where the n dimensions correspond to all the random variables involved. It may be that n is very very large.

Example: Variability of time to sort card deck

If you have a randomly sorted deck of 52 cards, how many steps does it take to sort them using a particular sorting algorithm? Find mean and standard deviation.

Algorithm to find mean and std. dev. of sorting times:
    initialize sum1 = 0, sum2 = 0
    loop N times:
        set up array of integers 1 to 52
        shuffle it
        call sort algorithm (which must tell you the number of steps, nsteps)
        sum1 += nsteps
        sum2 += nsteps*nsteps
    mean = sum1/N
    mean2 = sum2/N
    stddev = sqrt((mean2 - mean*mean)*N/(N-1))

Storing distributions in histograms

Figure showing a histogram

Simple 1-d histogram class

Note: ROOT, GSL, and many other packages provide histogram functions, but just to show how easy it is:

#include <vector>
using std::vector;

class SimpleHisto1 {
 private:
  vector<int> bins;   // stores histogram contents
  double xlow;        // lower end of histogram range
  double xhigh;       // upper end of histogram range

 public:

  // set or reset the histogram range and number of bins
  void Initialize(int nbin, double xlow_, double xhigh_) {
    bins.clear();
    bins.resize(nbin);
    xlow= xlow_;
    xhigh= xhigh_;
  }

  // add 1 to the appropriate bin
  int Fill(double x) {
    if (x<xlow || x>=xhigh)
      return 0;
    int ibin= int( (x-xlow)/(xhigh-xlow)*bins.size() );
    bins[ibin] += 1;
    return 1;
  }

  // get contents of bin i
  int BinContents(int i) const {
    return bins[i];
  }
};

Example: time to sort card deck (revisited)

If you have a randomly sorted deck of 52 cards, how many steps does it take to sort them using a particular sorting algorithm? Find the full probability distribution.

Algorithm to find distribution of sorting times:
    initialize histogram h
    loop N times:
        set up array of integers 1 to 52
        shuffle it
        call sort algorithm (which must tell you the number of steps, nsteps)
        h.Fill(nsteps)
    print histogram contents

Simulation of known random distributions

So far, we've used a RNG which gives a uniform random number u in the range [0,1).

Given such a number u, it's easy to get a number x that is uniform in [a,b): x=a+(b-a)u.

Likewise, it's easy to get an integer i in the range 0..(n-1): i=\text{int}(u*n).

And it's easy to get a boolean b with probability p for true, 1-p for false: b= (u<p).

Suppose you need a random number with an exponential probability distribution, or some other distribution?

Inverse distribution function method

Given a uniform random number u in the range [0,1), we have

\frac{dP}{du} = 1, \qquad  P_u(u)=u.

Here P_u is the integral of the probability density, called the "cumulative distribution function" or simply the "distribution function". [*] It gives the probability that a given random number will be less than or equal to a given number.

Suppose we want a non-uniform random number x, with properties

\frac{dP}{dx} = f(x), \qquad  P_x(x)= \int_{-\infty}^{x} f(x') dx'.

We can obtain that as follows:

x = P_x^{-1}(u).

The proof is simple and pleasant, so I leave it as an exercise to the student.

[*]Also called the "probability distribution function" in some texts, although this terminology is easily confused with the probability density function (p.d.f.), especially since the density function when drawn actually has the shape of the distribution of the probability.

Example: exponential distribution

The distribution function for the exponential distribution is easily inverted:

\frac{dP}{dx}=e^{-x}, \quad P_x(x) = 1-e^{-x}
P_x^{-1}(u) = -\log(1-u)

Thus, x=-\log(1-u) is an exponential random number with the desired probability density. This can be verified by calculating dP/dx = (dP/du)\cdot(du/dx).

Rejection method

Another way to obtain dP/dx \propto f(x), generally less efficient, is to generate x uniformly in x_\text{min} < x < x_\text{max}, then generate a second random number p in [0,1). If p < f(x)/f_\text{max}, where f_\text{max} is the largest value f can take, accept x; otherwise generate a new x and p and try again.

Figure illustrating rejection method, showing shaded region for values less than f, unshaded for greater values.

Pairs of (x,p) random variables are rejected if outside the shaded region.

Some pitfalls in using RNGs

Assignment: Random variable resistor network

Suppose I set up 3 variable 1-kohm resistors connected to big knobs in the hallway, with a big sign saying "DO NOT ADJUST". Every time a student passes, s/he adjusts the knobs to a random position, uniformly between minimum and maximum. Make a histogram of the total resistance of the network, where one knob is in parallel with the other two knobs in series, as shown in the figure.

Schematic diagram showing two variable resistors in series, in parallel with a single variable resistor.

Schematic diagram of the "random knobs".