Mathematical Foundations of Monte Carlo Methods 4 [ Bingo ]

Mathematical Foundations of Monte Carlo Methods 4

本文为在Scratchapixel上学习的翻译读后感与部分个人解读。这里不会将全篇的内容系数翻译，保留原文以便后期自行理解，笔者只精炼一些文章中关键的点出来便于记录。

Hit-or-Miss Monte Carlo Method

蒙特卡洛方法(Monte Carlo methods):是一个使用随机采样的数值方法来解决数学问题的方法。

原始的蒙特卡洛方法是允许任意分布采样的，所有的非均匀采样的目的都是降低方差提高估计量的有效性也就是重要性采样。

各向同性散射(Isotropic scattering)与各向异性散射(Anistropic scattering):一个光子在进入一个材质时发生了散射，并且其改变后了的方向是随机的，被称为是各项同性；反之，光子改变后了的方向若是只在一个圆锥方向内，那么被称之为各向异性。

Monte Carlo Estimator

对于函数 $f(x)$ 的积分求解，可以用面积法来表达

$F = \int_a^b f(x)\;dx.$

下图中，在函数 $f(x)$ 上随机采样一点 $x$ ，那么结果 $f(x) * (b-a)$ 。但是很明显它和实际收敛结果 $F$ 差距较大。

增加采样数目到4个点，此时对这几个点所求面积进行平均。那么最终的近似结果很明显根据大数定理会不断的逼近真实收敛值。

以下公式很好的表达了这个思想

$\langle F^N\rangle = (b-a) \dfrac{1}{N } \sum_{i=0}^{N-1} f(X_i).$

其中 $\langle F^N\rangle$ 表示了在采样空间 $S$ 中采样N个点之后 $F$ 的近似值，也等价于之前所讲到的样本均值 $\bar X_n$ 。

这里采样点满足均匀分布，即 $pdf(x) = \frac{1}{b-a}$ 。

$\langle F^N \rangle$ 也是随机变量(随机变量的和)，其期望就是 $F$ 本身。

$\begin{array}{l}$

E[\langle F^N \rangle] & = & E \left[ (b-a) \dfrac{1}{N } \sum_{i=0}^{N-1} f(x_i)\right],\\

& = & (b-a)\dfrac{1}{N}\sum_{i=0}^{N-1}E[f(x)],\\

& = &(b-a)\dfrac{1}{N} \sum_{i=0}^{N-1} \int_a^b f(x)pdf(x)\:dx\\

& = & \dfrac{1}{N} \sum_{i=0}^{N-1} \int_a^b f(x)\:dx,\\

&=& \int_a^b f(x)\:dx,\\

&=&F\\

\end{array} $E [⟨ F^{N} ⟩] = = = = = = E [(b - a) N 1 \sum_{i = 0}^{N - 1} f (x_{i})], (b - a) N 1 \sum_{i = 0}^{N - 1} E [f (x)], (b - a) N 1 \sum_{i = 0}^{N - 1} \int_{a}^{b} f (x) p d f (x) d x N 1 \sum_{i = 0}^{N - 1} \int_{a}^{b} f (x) d x, \int_{a}^{b} f (x) d x, F$

原文中公式有误，已纠正求和部分

由于前面选用了均匀分布的 $pdf(x)$ 的缘故，下面推广到任意 $pdf(x)$ 上。下方为通用的蒙特卡洛估计量的写法。

$\langle F^N \rangle = \dfrac{1}{N} \sum_{i=0}^{N-1} \dfrac{f(X_i)}{pdf(X_i)}.$

对其求取期望验证正确性

$\begin{array}{l}$

E[\langle F^N \rangle ] & = & E \left[ \dfrac{1}{N } \sum_{i=0}^{N-1} \dfrac{f(X_i)}{pdf X_i)} \right],\

& = & \dfrac{1}{N} \sum_{i=0}^{N-1} E\left[ \dfrac{f(X_i)}{pdf(X_i) }\right],\

& = & \dfrac{1}{N} \sum_{i=0}^{N-1} \int_\Omega \dfrac{f(x)}{pdf(x)} pdf(x)\;dx, \\

& = & \dfrac{1}{N} \sum_{i=0}^{N-1} \int_\omega f(x) \; dx, \

& = & F.

\end{array} $E [⟨ F^{N} ⟩] = = = = = E [N 1 \sum_{i = 0}^{N - 1} p d f X _{i} ) f ( X _{i} )], N 1 \sum_{i = 0}^{N - 1} E [p d f ( X _{i} ) f ( X _{i} )], N 1 \sum_{i = 0}^{N - 1} \int_{Ω} p d f ( x ) f ( x ) p d f (x) d x, N 1 \sum_{i = 0}^{N - 1} \int_{ω} f (x) d x, F .$

上下限 $(b-a)$ 这样一个积分区间在通用写法中是隐藏的。原因很简单， $(b-a)$ 的产生是因为估计量 $h(x) = \frac{f(x)}{pdf(x)}$ 下均匀分布的 $pdf(x) = \frac{1}{b-a}$ 引起的。

因此原式应该是这样: $\langle F^N \rangle = \frac{1}{N}\sum_{i=0}^{N-1}\frac{f(X_i)}{\frac{1}{b-a}}$ ，而写法 $\langle F^N\rangle = (b-a) \dfrac{1}{N } \sum_{i=0}^{N-1} f(X_i)$ 会更容易能够从图中直接推出。

Properties of Monte Carlo Integration

蒙特卡洛积分估计值会向函数 $f(x)$ 收敛/逼近。 $\text{Pr} \left ( \lim_{N\to\infty} \langle F^N \rangle = F \right ) = 1$
蒙特卡罗估计量是无偏且一致的
收敛的速度和函数的方差 $\sigma^2$ 成比例。估计量本身的方差为 $\frac{\sigma^2}{n}$ ，因此如若需要降低估计值错误为原来的一半，那么需要提高四倍的采样。( $\sigma[\langle F^N \rangle] \propto { 1 \over \sqrt{N} }$ )

无偏:样本均值的期望就是求解积分本身

一致:随着样本容量的增大，估计量愈来愈接近总体参数的真值)

Importance Sampling

重要性采样作为减小方差众多方法中的一个，本身的思想较为直接。

以下为不同采样分布的采样点对于近似值的影响(会高于或者低于真实积分解)。

此图中均匀分布其值勉强，但采样过程中似乎遗漏了函数当中较为重要的部分(一个高峰被忽略)。而右边的人为的采样也并不是一个较好的方法，这将会导致偏差(bias)。

如若被积函数为常数函数，那么采样选用均匀分布得到的结果本身就是正确的。

现有一函数，他与函数 $f(x)$ 成比例

$f(x) = c f'(x)$

因此

$\dfrac{f(x)}{f'(x) } = \dfrac{1}{c}$

那么这里代入蒙特卡洛估计量，联系常数函数的采样的结论。

$\langle F^N \rangle = \dfrac{1}{N} \sum_{i=0}^{N-1} {\dfrac{\color{orange}{f(x)}}{\color{red}{pdf(x)}}} = \dfrac{1}{N}\sum_{i=0}^{N-1}{\dfrac{\color{orange}{f(x)}}{\color{red}{f'(x)}}} = \dfrac{1}{N}\sum_{i=0}^{N-1}{\dfrac{\color{orange}{1}}{\color{red}{c}}}$

也就是说，只要 $pdf(x)$ 与被积函数 $f(x)$ 成比例，蒙特卡洛积分的方差就是0(常数函数的方差为0)。换言之， $pdf(x)$ 与被积函数 $f(x)$ 的相似度越高，那么偏差也就越低。

以 $f(x) = sin(x)$ ，区间 $[0, \frac{\pi}{2}]$ 为例

$\begin{array}{l}$

F & = & \int_0^{\pi \over 2} \sin(x) \; dx \\

& = & \left[ -\cos(x) \right]_0^{\pi \over 2} \\

& = & -\cos(\dfrac{\pi}{2}) - - \cos(0) \\

& = & 1.

\end{array} $F = = = = \int_{0}^{2 π} sin (x) d x [- cos (x)]_{0}^{2 π} - cos (2 π) - - cos (0) 1 .$

选用两个不同的 $pdf(x)$ 进行对比

#	Uniform	Importance	Error Uniform %	Error Importance %
0	1.125890	0.969068	12%	-3%
1	1.277833	0.925675	27%	-7%
2	1.054394	0.980940	5%	-1%
3	1.125890	0.969068	12%	-1%
4	1.125890	0.969068	12%	-6%
5	0.830151	1.041751	-16%	4%
6	1.062268	0.989363	6%	-1%
7	0.849265	1.043809	-15%	4%
8	0.921527	1.020279	-7%	2%
9	1.002310	0.994284	0%	0%

很明显的是，结果非常符合重要性采样理论。

Quasi Monte Carlo

随机采样中无法避免的就是当采样点近乎重合的现象(clump)，这也就意味着在最终计算时，其中一个采样点的信息也就被浪费，这不利于收敛的快速计算。

分层采样(Stratified Sampling): The interval of integration is divided into N subintervals or cells (also often called strata), samples are placed in the middle of these subintervals but are jittered by some negative or positive random offset which can’t be greater than half the width of a cell

换言之就是 $\langle F^N \rangle = { (b-a) \over N} \sum_{i=0}^{N-1} f(a + ( { {i+\xi} \over N } ) (b-a)).$ ，其中 $-h/2 \leq \xi \leq h/2$ 。分层采样的思想介于随机采样和均匀采样之间的。

低差异化序列(Low-Discrepancy Sequences):The goal is to generate sequences of samples which are not exactly uniformly distributed (uniformly distributed samples cause aliasing) and yet which appear to have some regularity in the way they are spaced.

Van der Corput Sequence的简介可见链接，思想就是将整数转换为二进制形式，根据小数点镜像对称。根据 $\phi_b(n) = {d_0 \over {2^1}} + {d_1 \over {2^2}} … + {d_{N-1}\over {2^N}}.$ 将给定的整数转换为小数形式。