謝宗翰的隨筆 : [最佳化理論] Conjugate Direction Methods (0) -Basic Theory

這次要介紹的是最佳化理論中的一類算法；叫做 Conjugate Direction Methods (共軛方向演算法)。此次我們主要focus在理論部分。實際演算法實現留待之後再介紹
(想要看算法的讀者建議直接閱讀 [最佳化理論] Conjugate Direction Methods (2) - The Conjugate Gradient Algorithm for Quadratic Objective function )

注意這邊我用 Method"s"，表示所謂的Conjugate Direction Method有很多種。所以我們稱之為這一類。在介紹之前先說說這類計算方法有甚麼特色

對於 2階具有n個變數的目標函數 (Quadratic Objective function with n variables) 可以在 n步驟內求解。(也就是對二階目標函數收斂性很好 (求解的計算速度夠快) )
一般而言常用的 Conjugate Direction Gradient Algorithm 方法不需要計算 Hessian Matrix
Conjugate Direction Methods 不須計算反矩陣

Comments:

上述的2階具有n個變數的目標函數 表示如下：
對 $u \in \mathbb{R}^n$ ， $Q$ 為對稱且正定矩陣 $Q=Q^T \succ 0$ ，目標函數寫為

\[
J(u) = \frac{1}{2} u^T Q u - u^T b
\]
那麼現在我們來問問，什麼叫Conjugate Direction??
本質上來說就是他是一個方向 (也就是向量 )! 具有共軛 (Conjugate) 的性質。所以我們得先知道什麼叫做 Conjugate；以下一組向量 $ d^{(0)}, d^{(1)}, ..., d^{(m)}$ 被稱作 Conjugate 的定義

-----------------
${ \bf \text{Definition: (Q-Conjugate)}}$
令 $Q$ 為一個 real symmetric $n \times n$ 矩陣。其方向 $ d^{(0)}, d^{(1)}, ..., d^{(m)}$ 稱作 Q-conjugate 如果下列條件成立：
\[
\forall i \neq j, \ d^{(i)T} Q d^{(j)} =0
\]-----------------

知道這個Conjugate定義之後會問說要知道這個定義有甚麼用呢? 用處就是一旦有了Conjugate direction，則這些directions會彼此線性獨立。我們將此結果記做下面引理

-----------------
$\text{Lemma: (Q-Conjugate direction is linearly independent)}$
令 $Q$ 為一個對稱且正定 (symmetric and positive definite) $n \times n$ 矩陣。若下列方向 $ d^{(0)}, d^{(1)}, ..., d^{(n-1)} \in \mathbb{R}^n$ 為非零向量且 Q-conjugate ，則這些方向互為線性獨立。

-----------------
Proof:
我們要證明 $ d^{(0)}, d^{(1)}, ..., d^{(n-1)}$ 互為線性獨立。由線性獨立定義，假設
\[
\displaystyle \sum_{i=0} ^{n-1} \alpha_i d^{(i)} =0,
\]必須證明 $\forall \alpha_i =0$

現在如果我們考慮對任意 $\alpha_k$ 的情況，將上述summation同乘 $d^{(k)T} Q$，可得
\[
d^{(k)T} Q \displaystyle \sum_{i=0} ^{n-1} \alpha_i d^{(i)} =0 \Rightarrow \displaystyle \sum_{i=0} ^{n-1} \alpha_i d^{(k)T} Q d^{(i)} =0 \ \ \ \ (*)
\]現在由於前提可知 $ d^{(0)}, d^{(1)}, ..., d^{(n-1)}$ 為 Q-conjugate，由Q-conjugate 定義可知 $\forall i \neq j$, $d^{(i)T}Qd^{(j)} =0$

故對所有的 $i \neq k$， $d^{(k)T}Qd^{(i)} =0$ ，亦即：

\[
\displaystyle \sum_{i=0} ^{n-1} \alpha_i d^{(k)T} Q d^{(i)} =0 ; \Rightarrow \displaystyle \alpha_k d^{(k)T} Q d^{(k)} =0
\]又因為我們的前提告訴我們 $Q$ 為一個 symmetric positive definite $n \times n$ 矩陣，且方向 $ d^{(0)}, d^{(1)}, ..., d^{(n-1)} \in \mathbb{R}^n$ 為非零向量；

由正定矩陣定義： $d^{(k)T} Q d^{(k)} > 0$, $\forall d^{(k)} \neq 0$。故可知
\[
(*) \Rightarrow {\alpha _k}\underbrace {{d^{(k)T}}Q{d^{(k)}}}_{ > 0} = 0
\]亦即
\[
\Rightarrow {\alpha _k} = 0
\]現在由於 $\alpha_k$ 為任意給定。故
\[
\displaystyle \sum_{i=0} ^{n-1} \alpha_k d^{(k)} =0, \ \forall \alpha_k =0
\]即為所求。 $\square$

ref: E. K. P. Chong, S. H. Zak, An Introduction to Optimization 2nd.

----
延伸閱讀
[最佳化理論] Conjugate Direction Methods (1) - Basic Algorithm for Quadratic Objective function

[最佳化理論] Conjugate Direction Methods (2) - The Conjugate Gradient Algorithm for Quadratic Objective function

謝宗翰的隨筆

3/19/2014

[最佳化理論] Conjugate Direction Methods (0) -Basic Theory

沒有留言:

張貼留言

[測度論] 期望值下確界與函數值下確界之恆等式

搜尋此網誌