The Monte Carlo approach [Metropolis and Ulam, 1949] to solving an integral of a function over a given domain is to define a random variable and draw samples from the domain such that the expected value approximates the solution of the integral. Before describing Monte Carlo integration and other related sampling methods in more detail, the following two sections introduce the preliminary concepts of Random variables and Probability distributions respectively.
2.4.1 Random Variables
A sequence of random events, each having an outcome (numerical in most cases) and an associated probability of that outcome, is collectively called a Monte Carlo sequence. In such a sequence, the probabilities of all possible values sum to 1. A variable $X$ which takes these outcomes as its values is known as a Random variable. A discrete Random variable takes a finite number of these outcomes, while a continuous Random variable can take any number of possible outcomes, often associated with a continuous function.
The Expectation, or mean, of a discrete Random variable $X$ with $n$ possible outcomes $X_i$, each having probability $p_i$, is given by:
$$E[X] = \sum_{i=1}^{n} X_i\,p_i \quad \left[\text{where } \sum_{i=1}^{n} p_i = 1\right] \tag{2.18}$$
The Variance of the outcomes about the mean (expectation) of $X$ is given by:
$$\sigma^2 = E[(X - E[X])^2] = \sum_{i=1}^{n} (X_i - E[X])^2\,p_i \tag{2.19}$$
The variance measures the deviation from the mean or, in the case of Random variables, the deviation from the Expectation $E[X]$. A process such as Monte Carlo, which estimates the value of $E[X]$, aims to drive the variance of its estimate towards zero ($\sigma^2 \to 0$).
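The definitions of Expectation and Variance above can be checked with a small worked example. The sketch below is only illustrative: the fair six-sided die and the function names `expectation` and `variance` are assumptions made here and do not come from the text. Exact fractions are used so that no rounding error obscures the result.

```python
from fractions import Fraction

def expectation(outcomes, probs):
    """E[X] = sum_i X_i * p_i (Equation 2.18)."""
    return sum(x * p for x, p in zip(outcomes, probs))

def variance(outcomes, probs):
    """sigma^2 = sum_i (X_i - E[X])^2 * p_i (Equation 2.19)."""
    mean = expectation(outcomes, probs)
    return sum((x - mean) ** 2 * p for x, p in zip(outcomes, probs))

# Illustrative assumption (not from the text): a fair six-sided die.
outcomes = [1, 2, 3, 4, 5, 6]
probs = [Fraction(1, 6)] * 6
assert sum(probs) == 1  # the probabilities of all outcomes must sum to 1

print(expectation(outcomes, probs))  # 7/2
print(variance(outcomes, probs))     # 35/12
```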
2.4.2 Probability Distributions
The Probability Density Function (PDF), denoted as $p(X)$, can be defined over any arbitrary continuous random event so that the probability of an outcome falling in the infinitesimal interval $dX$ around $X = x$ is $p(X = x)\,dX$, where $\int p(X)\,dX = 1$. Here $X$ is a continuous Random variable for the continuous random event over which $p(X)$ is defined.
The Cumulative Distribution Function (CDF) gives the probability that an outcome from a set of random events has a value less than or equal to a given value $y$. The CDF for the outcome $[X = y]$, denoted as $P_{cdf}(y)$, is the probability of all possible outcome values that are less than or equal to $y$, i.e. $p(X \le y)$. The CDF of a continuous Random variable $X$ is defined as:
$$P_{cdf}(y) = p(X \le y) = \int_{-\infty}^{y} p(X)\,dX \tag{2.20}$$
Similarly for an interval:
$$p(a \le X \le b) = p(X \le b) - p(X \le a) = \int_a^b p(X)\,dX = P_{cdf}(b) - P_{cdf}(a) \tag{2.21}$$
As with discrete Random variables, the Expectation and Variance of a continuous Random variable $X$ are given by:
$$E[X] = \int X\,p(X)\,dX \tag{2.22}$$
$$\sigma^2 = E[X^2] - E^2[X] = \int X^2\,p(X)\,dX - \left(\int X\,p(X)\,dX\right)^2 \tag{2.23}$$
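As a rough numerical illustration of the CDF, the interval probability, and the continuous Expectation and Variance defined above, the sketch below assumes a simple PDF $p(x) = 2x$ on $[0, 1]$. This PDF, the midpoint-rule helper `integrate` and all other names are assumptions chosen here for illustration; they do not appear in the text.

```python
def pdf(x):
    """Illustrative PDF p(x) = 2x on [0, 1] (an assumption, not from the text)."""
    return 2.0 * x if 0.0 <= x <= 1.0 else 0.0

def integrate(func, lo, hi, steps=10_000):
    """Midpoint-rule approximation of the integral of func over [lo, hi]."""
    width = (hi - lo) / steps
    return sum(func(lo + (k + 0.5) * width) for k in range(steps)) * width

def cdf(y):
    """P_cdf(y): integral of the PDF up to y (the PDF is zero below 0)."""
    upper = min(max(y, 0.0), 1.0)
    return integrate(pdf, 0.0, upper)

total = integrate(pdf, 0.0, 1.0)                          # PDF integrates to 1
interval = cdf(0.75) - cdf(0.25)                          # p(0.25 <= X <= 0.75)
mean = integrate(lambda x: x * pdf(x), 0.0, 1.0)          # E[X]
second = integrate(lambda x: x * x * pdf(x), 0.0, 1.0)    # E[X^2]
var = second - mean ** 2                                  # E[X^2] - E^2[X]

print(total, interval, mean, var)
```

For this particular PDF the analytic values are $P_{cdf}(y) = y^2$, $E[X] = 2/3$ and $\sigma^2 = 1/18$, which the numerical results should approach.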
2.4.3 Monte Carlo Integration
The goal of the Monte Carlo technique [Metropolis and Ulam, 1949] is to estimate the value of an integral:
$$I = \int_a^b f(x)\,dx \tag{2.24}$$
where $f(x)$ is an arbitrary function defined over the domain $x \in (a, b)$. By definition, a Monte Carlo process takes a finite number of samples of $f(x)$, $x \in (a, b)$. The selection of these samples is a random process, and thus each sample carries a probability given by an underlying PDF $p(x)$. Consider a function $G$ equal to the sum of random numbers $g_i(x)$ weighted with $w_i$:
$$G(x) = \sum_{i=1}^{N} w_i\,g_i(x) \tag{2.25}$$
When all weights are equal ($w_i = w_{i+1}$) and $\sum_i w_i = 1$, so that $w_i = 1/N$,
$$G(x) = \frac{1}{N} \sum_{i=1}^{N} g_i(x) \tag{2.26}$$
it can be proven, for independent and identically distributed $g_i(x)$, that:
$$E[G(x)] = E[g(x)] \tag{2.27}$$
$$\sigma^2[G(x)] = \frac{1}{N}\,\sigma^2[g(x)] \tag{2.28}$$
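The $1/N$ scaling of the variance can be observed empirically. The following sketch assumes, purely for illustration, that each $g_i$ is an independent uniform random number in $[0, 1)$, whose variance is $1/12$; the function names and the number of trials are likewise assumptions made here, not part of the text.

```python
import random
import statistics

def averaged_sample(rng, n):
    """G = (1/N) * sum of N values g_i, each drawn uniformly from [0, 1)."""
    return sum(rng.random() for _ in range(n)) / n

def variance_of_average(n, trials=20_000, seed=0):
    """Empirical variance of G over many independent repetitions."""
    rng = random.Random(seed)
    values = [averaged_sample(rng, n) for _ in range(trials)]
    return statistics.pvariance(values)

# A single uniform sample has variance 1/12, so the predicted
# variance of the average of N samples is (1/12) / N.
for n in (1, 4, 16):
    print(n, variance_of_average(n), (1.0 / 12.0) / n)
```

The measured and predicted columns should agree to within sampling noise, with the variance dropping by roughly a factor of four each time $N$ is quadrupled.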
Monte Carlo Estimator: As the expectations of $G(x)$ and $g(x)$ are the same, $G(x)$ can be used to estimate $E[g(x)]$. Moreover, Equation 2.28 shows that the variance of $G(x)$ decreases as $N$ increases, so the accuracy of the estimate of $E[g(x)]$ improves by a factor of $\sqrt{N}$. To solve Equation 2.24, let us first assume that $f(x) = g(x)\,p(x)$, where $p(x)$ is the probability density with which the sample $x$ is selected, so that $I = \int g(x)\,p(x)\,dx = E[g(x)]$. From Equations 2.26 and 2.24,
$$G(x) = \frac{1}{N} \sum_{i=1}^{N} \frac{f(x_i)}{p(x_i)} = \langle I \rangle \tag{2.29}$$
where $\langle I \rangle$ is the Monte Carlo estimator of $I$. Any integral in the form of Equation 2.24 estimated by the Monte Carlo technique has an estimator of the form $\langle I \rangle$. It can be further shown that $E[\langle I \rangle] = I$ and that the variance of the estimator $\langle I \rangle$ is:
$$\sigma^2 = \frac{1}{N} \int \left(\frac{f(x)}{p(x)} - I\right)^2 p(x)\,dx \tag{2.30}$$
This indicates that the variance $\sigma^2$ decreases monotonically as $N$ increases. The standard deviation, i.e. the error, is thus inversely proportional to $\sqrt{N}$, so halving the error requires four times as many samples.
Solving an integral with Monte Carlo techniques consists of three steps: taking $N$ samples from a probability distribution, evaluating the function at each sample, and then taking the mean of these evaluations, each weighted by $1/p(x)$ as in Equation 2.29. The next sections briefly introduce the different types of sampling procedures over a probability distribution.
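The three steps (sample, evaluate, average) can be sketched as follows. This is a minimal sketch under stated assumptions: the samples are drawn from a uniform PDF $p(x) = 1/(b - a)$, the test integrand $\sin(x)$ over $(0, \pi)$ is chosen here only because its integral is known to be 2, and the function name `mc_estimate` is hypothetical.

```python
import math
import random

def mc_estimate(f, a, b, n, seed=0):
    """Monte Carlo estimator <I> of the integral of f over (a, b).

    Assumes a uniform PDF p(x) = 1 / (b - a); other PDFs are possible.
    """
    rng = random.Random(seed)
    pdf = 1.0 / (b - a)
    total = 0.0
    for _ in range(n):
        x = rng.uniform(a, b)   # step 1: draw a sample according to p(x)
        total += f(x) / pdf     # step 2: evaluate f(x) / p(x) at the sample
    return total / n            # step 3: take the mean of the evaluations

# Illustrative integrand (an assumption): the exact integral of sin over (0, pi) is 2.
estimate = mc_estimate(math.sin, 0.0, math.pi, 100_000)
print(estimate)
```

Repeating the call with four times as many samples should roughly halve the typical deviation from the exact value, in line with the $1/\sqrt{N}$ behaviour described above.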
2.4.4 Importance Sampling
Solving the integral $I = \int_a^b f(x)\,dx$ with the estimator $\langle I \rangle$ requires a PDF $p(x)$ according to which all the samples are drawn, via any of the Inverse CDF, Cosine lobe or Rejection sampling methods. The optimal PDF $p(x)$ can be found by minimising the variance $\sigma^2$. It can be shown that $p(x) = \frac{1}{\sqrt{\lambda}}\,|f(x)|$, where $\lambda$ is a scalar constant. This effectively shows that $p(x)$ is optimal when it has the shape of $f(x)$, i.e. when it is proportional to $|f(x)|$.
This points to two different cases for Monte Carlo integration: blind Monte Carlo, where there is no information about $f(x)$, and informed Monte Carlo, where there is some knowledge of $f(x)$. While there is no way to optimise $p(x)$ in blind Monte Carlo due to the absence of information about $f(x)$, in informed Monte Carlo $p(x)$ can be shaped to resemble $f(x)$ using whatever information is available about it.
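The difference between blind and informed Monte Carlo can be illustrated with a small experiment. Everything specific in the sketch below is an assumption made for illustration: the integrand $f(x) = 3x^2$ on $(0, 1)$ (so that $I = 1$), the informed PDF $p(x) = 2x$, which only roughly follows the shape of $f(x)$, and all function names. Samples for the informed case are drawn with the Inverse CDF method: $P_{cdf}(x) = x^2$, hence $x = \sqrt{u}$ for uniform $u$.

```python
import math
import random

def f(x):
    """Illustrative integrand (an assumption): f(x) = 3x^2, so I = 1 on (0, 1)."""
    return 3.0 * x * x

def blind_samples(n, rng):
    """Values f(x)/p(x) for the uniform PDF p(x) = 1; no knowledge of f is used."""
    return [f(rng.random()) / 1.0 for _ in range(n)]

def informed_samples(n, rng):
    """Values f(x)/p(x) for the PDF p(x) = 2x, sampled via the inverse CDF."""
    values = []
    for _ in range(n):
        u = 1.0 - rng.random()      # u in (0, 1], which avoids p(x) = 0
        x = math.sqrt(u)            # inverse of P_cdf(x) = x^2
        values.append(f(x) / (2.0 * x))
    return values

def mean_and_variance(values):
    """Sample mean (the estimator <I>) and per-sample variance of f(x)/p(x)."""
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    return mean, var

rng = random.Random(0)
blind_mean, blind_var = mean_and_variance(blind_samples(50_000, rng))
informed_mean, informed_var = mean_and_variance(informed_samples(50_000, rng))
print(blind_mean, blind_var)
print(informed_mean, informed_var)
```

Both estimators should land close to 1, but the per-sample variance of the informed version should be clearly smaller, because $f(x)/p(x)$ varies less when $p(x)$ resembles $f(x)$.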
2.4.5 Stratified Sampling
Clearly, Importance sampling does not work very well for blind Monte Carlo because the sample distributions are often inefficient. Moreover, even with an efficient PDF $p(x)$, a small number of samples can be badly distributed inside the domain, resulting in “sample clumping”. Increasing the number of samples will eventually mitigate this problem, but doing so is too inefficient when fast, real-time solutions are required.
Stratified sampling solves the clumping problem by dividing the domain ($\Omega$) into a finite number of “strata” delimited by the boundaries $(\omega_0, \omega_1, \ldots, \omega_N)$ and then evaluating each of the strata to find the solution of $I$:
$$\int_{\Omega} f(x)\,dx = \sum_{i=1}^{N} \int_{\omega_{i-1}}^{\omega_i} f(x)\,dx \tag{2.31}$$
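The decomposition into strata can be sketched in a few lines. The sketch assumes equally sized strata with one uniformly placed sample per stratum and compares the spread of the result against blind Monte Carlo with the same number of samples. The integrand $\sin(x)$ over $(0, \pi)$, the choice of 16 strata, and all function names are assumptions made here for illustration only.

```python
import math
import random

def blind_estimate(f, a, b, n, rng):
    """Blind Monte Carlo: n uniform samples over the whole domain (a, b)."""
    return (b - a) * sum(f(rng.uniform(a, b)) for _ in range(n)) / n

def stratified_estimate(f, a, b, n, rng):
    """Stratified sampling: n equal strata, one uniform sample in each."""
    width = (b - a) / n
    total = 0.0
    for i in range(n):
        lo = a + i * width                 # lower boundary of stratum i
        x = lo + rng.random() * width      # one sample inside the stratum
        total += f(x) * width              # estimate of the stratum's integral
    return total

def spread(estimator, trials=2_000, seed=0):
    """Mean and variance of repeated estimates of the integral of sin over (0, pi)."""
    rng = random.Random(seed)
    results = [estimator(math.sin, 0.0, math.pi, 16, rng) for _ in range(trials)]
    mean = sum(results) / trials
    var = sum((r - mean) ** 2 for r in results) / trials
    return mean, var

blind_mean, blind_var = spread(blind_estimate)
strat_mean, strat_var = spread(stratified_estimate)
print(blind_mean, blind_var)
print(strat_mean, strat_var)
```

Both means should be close to the exact value 2, while the variance of the stratified estimates should be noticeably lower, since each stratum is guaranteed exactly one sample and clumping cannot occur.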
It can be further shown that using equally sized strata with one sample per stratum can make the Variance ($\sigma^2$) of stratified sampling lower than that of simple, blind Monte Carlo. However, $N$ depends on the size of the strata compared to the size of the domain.
Monte Carlo integration and the relevant sampling strategies discussed in this chapter are crucial for solving the Fredholm integral [Fredholm, 1903] for light transport, i.e. the rendering equation already discussed in Section 2.3. The next section describes a few relevant Global Illumination (GI) algorithms which make use of these methods to solve the rendering equation and render photorealistic images.