9/21/2009

[Probability Theory] A Brief Look at the Weak Law of Large Numbers

Below we discuss some results on the Weak Law of Large Numbers (WLLN). We begin with convergence in probability for a sequence of random variables.

=============================
Definition: Convergence in Probability
Let $Y_n$ be a sequence of random variables. We say that $Y_n$ converges to $Y$ in probability if, for every $\varepsilon >0$,
\[
P(|Y_n - Y| > \varepsilon) \rightarrow 0 \;\; \text{ as  $n \rightarrow \infty$}
\]=============================

Comments:
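1. The definition above is equivalent to
\[
P(|Y_n - Y| \leq \varepsilon) \rightarrow 1 \;\; \text{ as  $n \rightarrow \infty$}
\]
2. In $Y_n \to^P Y$, the limit $Y$ may be a random variable or a constant.
3. Convergence in probability plays an important role in probability theory, stochastic processes, and statistical theory. For example, in statistics an estimator that converges in probability to the true parameter is called a consistent estimator; we do not go into detail here.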
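As a small illustration of the definition, the sketch below uses a hypothetical example that is not part of the original post: $Y_n$ is the maximum of $n$ i.i.d. Uniform(0,1) draws and $Y = 1$. In that case $P(|Y_n - 1| > \varepsilon) = (1-\varepsilon)^n$, which tends to $0$. The function names and the choice of example are assumptions made for illustration only.

```python
import random

def exact_tail(n, eps):
    """P(|Y_n - 1| > eps) when Y_n is the max of n i.i.d. Uniform(0,1) draws."""
    # Y_n <= 1 always, so the event means every one of the n draws is below 1 - eps.
    return max(0.0, 1.0 - eps) ** n

def simulated_tail(n, eps, trials=5000, seed=0):
    """Monte Carlo estimate of the same probability (seeded for repeatability)."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        y_n = max(rng.random() for _ in range(n))
        if abs(y_n - 1.0) > eps:
            hits += 1
    return hits / trials

for n in (1, 5, 20, 50):
    print(n, exact_tail(n, 0.1), simulated_tail(n, 0.1))
```

The printed probabilities shrink as $n$ grows, which is what convergence in probability to the constant $1$ requires for this particular $\varepsilon$.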




==============================
Definition: Uncorrelated Random Variables
Recall that random variables $X_i, \; i \in \mathbb{N}$, with $E [X_i^2] <\infty$ are called uncorrelated if, whenever $i \neq j$,
\[
E[X_i X_j] = E[X_i] E[X_j]
\]============================
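Uncorrelated is weaker than independent. The sketch below checks this exactly on a hypothetical pair that is not taken from the original post: $X$ uniform on $\{-1, 0, 1\}$ and $Y = X^2$. The variable names are illustrative assumptions.

```python
from fractions import Fraction

# Assumed example: X is uniform on {-1, 0, 1} and Y = X**2.
support = [-1, 0, 1]
prob = Fraction(1, 3)

def expect(f):
    """Exact expectation of f(X)."""
    return sum(prob * f(x) for x in support)

e_x = expect(lambda x: x)           # E[X]
e_y = expect(lambda x: x ** 2)      # E[Y]
e_xy = expect(lambda x: x * x ** 2) # E[XY] = E[X^3]

uncorrelated = (e_xy == e_x * e_y)

# Independence would need P(X=0, Y=0) = P(X=0) * P(Y=0).
p_joint = prob            # {X=0} and {Y=0} are the same event
p_product = prob * prob   # P(X=0) * P(Y=0)
independent = (p_joint == p_product)

print(uncorrelated, independent)
```

Here $E[XY] = E[X]E[Y] = 0$, so the pair satisfies the definition, even though $Y$ is a deterministic function of $X$.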

We now look at a result about uncorrelated random variables.

=============================
Theorem:
Let $X_1, X_2, \dots, X_n$ be uncorrelated with $E[X_i^2] < \infty$. Then
\[
var(X_1 + ... + X_n ) = var(X_1) + ... + var(X_n)
\]where $var(X)$ denotes the variance of $X$.
=============================

Proof: omitted.
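Although the proof is omitted, the identity is easy to check exactly on a small case. The sketch below uses an assumed example (not from the original post): $X$ uniform on $\{-1,0,1\}$ with $Y = X^2$, which are uncorrelated, contrasted with the correlated pair $(X, X)$ where additivity fails. The helper `variance` is a hypothetical name.

```python
from fractions import Fraction

def variance(values):
    """Exact variance of a random variable uniform over the listed outcomes."""
    n = len(values)
    mean = Fraction(sum(values), n)
    return sum((Fraction(v) - mean) ** 2 for v in values) / n

xs = [-1, 0, 1]            # outcomes of X, each with probability 1/3
ys = [x * x for x in xs]   # Y = X^2 evaluated on the same outcomes

# Uncorrelated pair: the variance of the sum equals the sum of the variances.
var_sum_uncorrelated = variance([x + y for x, y in zip(xs, ys)])

# Correlated pair (X, X): var(X + X) = 4 var(X), not 2 var(X).
var_sum_correlated = variance([x + x for x in xs])

print(var_sum_uncorrelated, variance(xs) + variance(ys))
print(var_sum_correlated, 2 * variance(xs))
```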


=============================
Lemma:
If $p >0$ and $E[|X_n|^p] \rightarrow 0$, then $X_n \rightarrow 0$ in probability.
=============================

Proof:
We must show that $X_n \rightarrow 0$ in probability, that is, for every $\varepsilon > 0$,
\[P(|{X_n} - 0| > \varepsilon ) \to 0
\]First observe that, since $p > 0$, the following events coincide:
\[\begin{array}{l}
\left\{ {|{X_n}| > \varepsilon } \right\} = \left\{ {|{X_n}{|^p} > {\varepsilon ^p}} \right\}\\
 \Rightarrow P\left\{ {|{X_n}| > \varepsilon } \right\} = P\left\{ {|{X_n}{|^p} > {\varepsilon ^p}} \right\}
\end{array}
\]By the Markov inequality,
\[P\left\{ {|{X_n}| > \varepsilon } \right\} = P\left\{ {|{X_n}{|^p} > {\varepsilon ^p}} \right\} \le \frac{{E\left[ {|{X_n}{|^p}} \right]}}{{{\varepsilon ^p}}}
\]Since $E[|X_n|^p] \rightarrow 0$ and $\varepsilon^p $ is a fixed constant,
\[P\left\{ {|{X_n}| > \varepsilon } \right\} \le \frac{{E\left[ {|{X_n}{|^p}} \right]}}{{{\varepsilon ^p}}} \to {\rm{0}} \ \ \ \ \square\]
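The Markov bound used in the proof can be checked on a concrete family. The sketch below assumes a hypothetical sequence that is not in the original post: $X_n = n^{1/4}$ with probability $1/n$ and $X_n = 0$ otherwise, so that $E[|X_n|^2] = n^{-1/2} \to 0$. The function name `markov_bound` is an assumption.

```python
def markov_bound(n, eps, p=2.0):
    """Return (exact tail, Markov bound) for X_n = n**0.25 w.p. 1/n, else 0."""
    value = n ** 0.25
    prob = 1.0 / n
    moment = (value ** p) * prob          # E[|X_n|^p]
    tail = prob if value > eps else 0.0   # P(|X_n| > eps)
    return tail, moment / eps ** p

for n in (1, 10, 100, 10000):
    tail, bound = markov_bound(n, eps=0.5)
    print(n, tail, bound)
```

In each row the exact tail probability sits below the bound $E[|X_n|^p]/\varepsilon^p$, and both shrink as $n$ grows.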


=============================
Theorem: $L^2$ Weak Law
Let $X_1, X_2, \dots$ be uncorrelated random variables with $E[X_i] = \mu$ and $var(X_i) \le C < \infty$. Define the partial sum $S_n := X_1 + X_2 + ... + X_n$. Then
\[
S_n/n \rightarrow \mu \text{  in Probability as  $n \rightarrow \infty$}
\]=============================

Proof:
Given $\varepsilon>0$, we must show that as $n \rightarrow \infty$
\[P\left( {\left| {\frac{{{S_n}}}{n} - \mu } \right| > \varepsilon } \right) \to 0 \ \ \ \ (*)
\]By the preceding Lemma with $p = 2$, it suffices to show that $E\left[ {{{\left| {\frac{{{S_n}}}{n} - \mu } \right|}^2}} \right] \to 0$; then $(*)$ follows automatically.

First observe that
\[E\left[ {\frac{{{S_n}}}{n}} \right] = \frac{1}{n}E\left[ {{S_n}} \right] = \frac{1}{n}\sum\limits_{i = 1}^n {E{X_i}}  = \frac{\mu }{n}n = \mu \]Hence
\[\begin{aligned}
E\left[ \left| \frac{S_n}{n} - \mu \right|^2 \right] &= E\left[ \left| \frac{S_n}{n} - E\left[ \frac{S_n}{n} \right] \right|^2 \right]\\
&= var\left( \frac{S_n}{n} \right) = \frac{1}{n^2}var\left( S_n \right)\\
&= \frac{1}{n^2}var\left( \sum\limits_{i = 1}^n X_i \right)
\end{aligned}\]Since $X_1, ..., X_n$ are uncorrelated, we have $var\left( {\sum\limits_{i = 1}^n {{X_i}} } \right) = \sum\limits_{i = 1}^n {var\left( {{X_i}} \right)} $. Substituting this into the expression above gives
\[\begin{aligned}
E\left[ \left| \frac{S_n}{n} - \mu \right|^2 \right] &= \frac{1}{n^2}var\left( \sum\limits_{i = 1}^n X_i \right) = \frac{1}{n^2}\sum\limits_{i = 1}^n var\left( X_i \right)\\
&\le \frac{1}{n^2}Cn = \frac{C}{n} \to 0
\end{aligned}\]as $n \rightarrow \infty$, so $S_n/n \to \mu$ in $L^2$. Since $L^2$ convergence implies convergence in probability (the Lemma with $p = 2$),
\[P\left( {\left| {\frac{{{S_n}}}{n} - \mu } \right| > \varepsilon } \right) \to 0 \ \ \ \ \square
\]
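Combining the Markov step with the variance bound gives $P(|S_n/n - \mu| > \varepsilon) \le C/(n\varepsilon^2)$. The sketch below compares this bound with the exact probability in an assumed special case that is not in the original post: i.i.d. Bernoulli(1/2) variables, for which $\mu = 1/2$ and $C = 1/4$. Function names are illustrative.

```python
from fractions import Fraction
from math import comb

def exact_deviation_prob(n, eps):
    """Exact P(|S_n/n - 1/2| > eps) for S_n ~ Binomial(n, 1/2)."""
    half = Fraction(1, 2)
    count = sum(comb(n, k) for k in range(n + 1)
                if abs(Fraction(k, n) - half) > eps)
    return float(Fraction(count, 2 ** n))

def l2_bound(n, eps, c=Fraction(1, 4)):
    """The bound C / (n eps^2), with C = 1/4 the variance of Bernoulli(1/2)."""
    return float(c / (n * eps ** 2))

eps = Fraction(1, 10)
for n in (10, 100, 1000):
    print(n, exact_deviation_prob(n, eps), l2_bound(n, eps))
```

The bound is loose for small $n$ (it exceeds 1 at $n = 10$) but it does go to $0$ at rate $1/n$, which is all the proof needs.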

The question now is how to extend the weak law above (e.g., by dropping the finite second moment condition). To do so we need a few new definitions:

============================
Definition: Tail Equivalence of Two Sequences of Random Variables
We say that two sequences of random variables $\{X_n\}$ and $\{Y_n \}$ are tail equivalent if
\[
\sum_n P(X_n \neq Y_n) < \infty
\]===========================

============================
Definition: Truncation Function
Define the truncation function
\[X{1_{\left| X \right| \le M}} := \left\{ \begin{array}{ll}
X, & \left| X \right| \le M\\
0, & \left| X \right| > M
\end{array} \right.\]============================
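The truncation is simple to express in code. The following is a minimal sketch; the function name `truncate` and the sample values are assumptions chosen for illustration.

```python
def truncate(x, m):
    """Return x * 1_{|x| <= m}: x itself when |x| <= m, and 0 otherwise."""
    return x if abs(x) <= m else 0.0

samples = [0.5, -2.0, 3.5, -7.0]
truncated = [truncate(x, 3.0) for x in samples]
print(truncated)  # [0.5, -2.0, 0.0, 0.0]
```

Note that values beyond the level $M$ are replaced by $0$, not clipped to $\pm M$.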


We first record a useful fact.
FACT 1: If $Y \ge 0$ and $p >0$, then
\[
E[Y^p] = \int_0^\infty p y^{p-1} P(Y>y)dy
\]Proof: Write the tail probability as the integral of an indicator and exchange the order of integration (justified by Tonelli's theorem, since the integrand is nonnegative):
\[\begin{aligned}
\int_0^\infty p y^{p - 1} P(Y > y)dy &= \int_0^\infty p y^{p - 1} \int_\Omega 1_{Y > y}\, dP\, dy\\
&= \int_0^\infty \int_\Omega 1_{Y > y}\, p y^{p - 1}\, dP\, dy\\
&= \int_\Omega \int_0^\infty 1_{Y > y}\, p y^{p - 1}\, dy\, dP\\
&= \int_\Omega \int_0^Y p y^{p - 1}\, dy\, dP
\end{aligned}\]Since $ \int_0^Y {p{y^{p - 1}}} dy = {Y^p}$, substituting into the last line gives
\[\int_0^\infty  p {y^{p - 1}}P(Y > y)dy = \int_\Omega ^{} {\int_0^Y {p{y^{p - 1}}} dydP = } \int_\Omega ^{} {{Y^p}dP = } E[{Y^p}] \ \ \ \ \square
\]
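FACT 1 can be verified exactly when $Y$ is discrete, because the tail $P(Y > y)$ is then a step function. The sketch below assumes a hypothetical example not in the original post: $Y$ is the outcome of a fair six-sided die and $p$ is a positive integer. Function names are illustrative.

```python
from fractions import Fraction

def moment_direct(p, faces=6):
    """E[Y^p] computed directly for Y uniform on {1, ..., faces}."""
    return sum(Fraction(k ** p, faces) for k in range(1, faces + 1))

def moment_via_tail(p, faces=6):
    """Integral of p*y^(p-1)*P(Y > y) over [0, inf), computed piece by piece."""
    # On [k, k+1) the tail P(Y > y) is the constant (faces - k)/faces,
    # and the integral of p*y^(p-1) over [k, k+1) equals (k+1)^p - k^p.
    return sum(Fraction(faces - k, faces) * ((k + 1) ** p - k ** p)
               for k in range(faces))

for p in (1, 2, 3, 4):
    print(p, moment_direct(p), moment_via_tail(p))
```

For $p = 1$ this reduces to the familiar identity $E[Y] = \int_0^\infty P(Y > y)\,dy$.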


We can now state the general Weak Law of Large Numbers:

=============================
Theorem: Weak Law of Large Numbers
Let $X_1, X_2, ... $ be i.i.d. random variables and $S_n := X_1 + X_2 + ... + X_n$. If, as $n \rightarrow \infty$,
(1) $\sum_{j=1}^n P(|X_j| > n) \rightarrow 0$
(2) $\frac{1}{n^2} \sum_{j=1}^n E[ X_j^2 1_{|X_j| \le n}] \rightarrow 0$
then
\[
\frac{S_n - a_n}{n} \rightarrow 0 \text{  in Probability}
\] where $a_n := \sum_{j=1}^{n}E [X_j 1 _{|X_j| \le n}]$.
=============================

Comments:
Before the proof, note that the Weak Law of Large Numbers above makes no assumption on $E[X_i^2]$; in particular, the second moment need not be finite.
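To see that the two hypotheses can hold without a finite second moment, the sketch below evaluates them in closed form for an assumed distribution that is not in the original post: a Pareto law on $[1, \infty)$ with $P(X > x) = x^{-3/2}$, for which $E[X^2] = \infty$. The closed forms in the docstrings are worked out for this assumption only.

```python
from math import sqrt

ALPHA = 1.5  # assumed Pareto tail index; E[X^2] is infinite because ALPHA < 2

def hypothesis_1(n):
    """sum_{j<=n} P(|X_j| > n) = n * n**(-ALPHA) for i.i.d. Pareto(ALPHA)."""
    return n * n ** (-ALPHA)

def hypothesis_2(n):
    """(1/n^2) * sum_{j<=n} E[X_j^2 1_{|X_j| <= n}] for ALPHA = 1.5.

    E[X^2 1_{X <= n}] is the integral of x^2 * 1.5 * x^(-2.5) over [1, n],
    which equals 3 * (sqrt(n) - 1).
    """
    return 3.0 * (sqrt(n) - 1.0) / n

for n in (10, 1000, 100000):
    print(n, hypothesis_1(n), hypothesis_2(n))
```

Both quantities decay like $n^{-1/2}$, so hypotheses (1) and (2) are satisfied even though the variance does not exist.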

Proof:
Define $X_{nj}' := X_j 1_{|X_j| \le n}$ and $S_n' := \sum_{j=1}^n X_{nj}'$, and consider the tail parts:
\[\sum\limits_{j = 1}^n {P\left( {{X_{nj}}' \ne {X_j}} \right)}  = \sum\limits_{j = 1}^n {P\left( {\left| {{X_j}} \right| > n} \right) \to 0}
\]where the convergence follows from Hypothesis $(1)$. Next, observe that
\[\begin{aligned}
P\left( \left| S_n - S_n' \right| > \varepsilon \right) &\le P\left( S_n \ne S_n' \right)\\
&\le P\left( \bigcup\limits_{j = 1}^n \left\{ X_{nj}' \ne X_j \right\} \right)\\
&\le \sum\limits_{j = 1}^n P\left( X_{nj}' \ne X_j \right) = \sum\limits_{j = 1}^n P\left( \left| X_j \right| > n \right) \to 0
\end{aligned}\]Hence
\[
S_n - S_n' \rightarrow 0 \text{  in Probability }
\]and consequently $(S_n - S_n')/n \rightarrow 0$ in probability as well.
Next, by Chebyshev's inequality,
\[\begin{aligned}
P\left( \frac{\left| S_n' - E S_n' \right|}{n} > \varepsilon \right) &\le \frac{E\left[ \left| S_n' - E S_n' \right|^2 \right]}{n^2 \varepsilon^2} = \frac{\mathrm{var}\left( S_n' \right)}{n^2 \varepsilon^2}\\
&= \frac{1}{n^2 \varepsilon^2}\sum\limits_{j = 1}^n \mathrm{var}\left( X_{nj}' \right)\\
&\le \frac{1}{n^2 \varepsilon^2}\sum\limits_{j = 1}^n E\left[ \left( X_{nj}' \right)^2 \right]\\
&= \frac{1}{n^2 \varepsilon^2}\sum\limits_{j = 1}^n E\left[ \left( X_j 1_{\left\{ \left| X_j \right| \le n \right\}} \right)^2 \right]\\
&= \frac{1}{n^2 \varepsilon^2}\sum\limits_{j = 1}^n E\left[ X_j^2 1_{\left\{ \left| X_j \right| \le n \right\}} \right] \to 0 \ \ \ \ (**)
\end{aligned}\]The second line uses the independence of the $X_{nj}'$, and the final convergence follows from Hypothesis (2).

Note that\[{a_n}: = \sum\limits_{j = 1}^n E [{X_j}{1_{|{X_j}| \le n}}] = E\left[ {\sum\limits_{j = 1}^n {{X_j}{1_{|{X_j}| \le n}}} } \right] = E\left[ {{S_n}'} \right]\]so $(**)$ gives
\[\frac{{{S_n}' - {a_n}}}{n}\mathop  \to \limits^P 0\]Finally, since both terms on the right-hand side converge to $0$ in probability,
\[\frac{{{S_n} - {a_n}}}{n} = \frac{{{S_n} - {S_n}' + {S_n}' - {a_n}}}{n} = \frac{{{S_n} - {S_n}'}}{n} + \frac{{{S_n}' - {a_n}}}{n}\mathop  \to \limits^P 0 \ \ \ \ \square\]
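The conclusion can also be explored by simulation. The sketch below is only an illustration under assumptions that are not in the original post: the $X_j$ are i.i.d. Pareto on $[1,\infty)$ with $P(X > x) = x^{-3/2}$ (infinite variance), sampled by inverse transform, and for this law $a_n = 3n(1 - n^{-1/2})$. The function names, seed, and trial count are arbitrary choices.

```python
import random

ALPHA = 1.5  # assumed Pareto tail index (infinite variance)

def centered_average(n, rng):
    """Return (S_n - a_n)/n for n i.i.d. Pareto(ALPHA) draws on [1, inf)."""
    # Inverse-transform sampling: U in (0, 1] gives X = U**(-1/ALPHA).
    s_n = sum((1.0 - rng.random()) ** (-1.0 / ALPHA) for _ in range(n))
    # a_n = n * E[X 1_{X <= n}] = 3 n (1 - n**-0.5) when ALPHA = 1.5.
    a_n = 3.0 * n * (1.0 - n ** -0.5)
    return (s_n - a_n) / n

def deviation_frequency(n, eps, trials=200, seed=0):
    """Fraction of simulated runs with |(S_n - a_n)/n| > eps."""
    rng = random.Random(seed)
    hits = sum(abs(centered_average(n, rng)) > eps for _ in range(trials))
    return hits / trials

for n in (10, 200, 2000):
    print(n, deviation_frequency(n, eps=1.0))
```

Because the tail is heavy, convergence is slow, so the printed frequencies should be read as a rough empirical picture and not as a verification of the theorem.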
