Mathematical Derivation of PCA

Posted 25 days ago · 132 views


Given two data points as follows:

| | $X$ | $Y$ |
|---|---|---|
| $a$ | $a_1$ | $b_1$ |
| $b$ | $a_2$ | $b_2$ |

$$
\boldsymbol{a}=\begin{pmatrix}X_1\\Y_1\end{pmatrix}\quad\boldsymbol{b}=\begin{pmatrix}X_2\\Y_2\end{pmatrix}
$$

In the natural basis $\boldsymbol{e_1}=\begin{pmatrix}1\\0\end{pmatrix},\boldsymbol{e_2}=\begin{pmatrix}0\\1\end{pmatrix}$, their coordinates are:
$$
\boldsymbol{a}=\begin{pmatrix}X_1\\Y_1\end{pmatrix}=\begin{pmatrix}a_1\\b_1\end{pmatrix}\quad \boldsymbol{b}=\begin{pmatrix}X_2\\Y_2\end{pmatrix}=\begin{pmatrix}a_2\\b_2\end{pmatrix}
$$

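We want to reduce the dimension, that is, to choose a new basis in which the first coordinates $X_1$ and $X_2$ of the two points retain as much of the data as possible:

$$
X_1^2+X_2^2=\sum_{i=1}^2 X_i^2 \quad \text{is maximized}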
$$

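From here on, let $\boldsymbol{e_1}=\begin{pmatrix}e_{11}\\e_{12}\end{pmatrix}$ denote the unit vector along the first axis of the new basis (no longer the natural basis vector above). By the geometric meaning of the inner product, the coordinate of a point along $\boldsymbol{e_1}$ is its inner product with $\boldsymbol{e_1}$:
$$
X_1=\boldsymbol{a}\cdot\boldsymbol{e_1}=\begin{pmatrix}a_1&b_1\end{pmatrix}\cdot\begin{pmatrix}e_{11}\\e_{12}\end{pmatrix}=a_1e_{11}+b_1e_{12}
$$

$$
X_2=\boldsymbol{b}\cdot\boldsymbol{e_1}=\begin{pmatrix}a_2&b_2\end{pmatrix}\cdot\begin{pmatrix}e_{11}\\e_{12}\end{pmatrix}=a_2e_{11}+b_2e_{12}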
$$


$$
\begin{aligned}
X_1^2+X_2^2 &=(a_1e_{11}+b_1e_{12})^2+(a_2e_{11}+b_2e_{12})^2\\
&=a_1^2e_{11}^2+2a_1b_1e_{11}e_{12}+b_1^2e_{12}^2+a_2^2e_{11}^2+2a_2b_2e_{11}e_{12}+b_2^2e_{12}^2\\
&=(a_1^2+a_2^2)e_{11}^2+2(a_1b_1+a_2b_2)e_{11}e_{12}+(b_1^2+b_2^2)e_{12}^2
\end{aligned}
$$
This is a quadratic form in $\boldsymbol{e_1}$:
$$
X_1^2+X_2^2=\boldsymbol{e_1}^\mathrm{T}\underbrace{\begin{pmatrix}a_1^2+a_2^2&a_1b_1+a_2b_2\\a_1b_1+a_2b_2&b_1^2+b_2^2\end{pmatrix}}_{P}\boldsymbol{e_1}=\boldsymbol{e_1}^\mathrm{T}P\boldsymbol{e_1}
$$
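As a quick numerical sanity check of the identity above, here is a minimal sketch in plain Python. The two data points and the test angles are made-up values for illustration only, and the function names are hypothetical.

```python
import math

# Hypothetical sample data: a = (a1, b1), b = (a2, b2) in the natural basis.
a = (3.0, 1.0)
b = (-1.0, 2.0)

def projected_sum_of_squares(theta):
    """X_1^2 + X_2^2 for the unit direction e_1 = (cos theta, sin theta)."""
    e11, e12 = math.cos(theta), math.sin(theta)
    x1 = a[0] * e11 + a[1] * e12  # X_1 = a . e_1
    x2 = b[0] * e11 + b[1] * e12  # X_2 = b . e_1
    return x1 ** 2 + x2 ** 2

def quadratic_form(theta):
    """e_1^T P e_1, with P built from the same two points."""
    e11, e12 = math.cos(theta), math.sin(theta)
    p11 = a[0] ** 2 + b[0] ** 2
    p12 = a[0] * a[1] + b[0] * b[1]
    p22 = a[1] ** 2 + b[1] ** 2
    return p11 * e11 ** 2 + 2 * p12 * e11 * e12 + p22 * e12 ** 2

# The two expressions agree for any direction we try.
for theta in (0.0, 0.3, 1.0, 2.5):
    assert math.isclose(projected_sum_of_squares(theta), quadratic_form(theta))
```

Both functions compute the same quantity by two routes, so the assertions pass for every angle, which is exactly what the expansion into the matrix $P$ claims.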
Since $P$ is real symmetric, an orthogonal transformation $P=U\Sigma U^\mathrm{T}$ (with $U$ orthogonal) reduces it to canonical form:
$$
\begin{aligned}
X_1^2+X_2^2 &=\boldsymbol{e_1}^\mathrm{T}P\boldsymbol{e_1}\\
&=\boldsymbol{e_1}^\mathrm{T}U\Sigma U^\mathrm{T}\boldsymbol{e_1}\\
&=(U^\mathrm{T}\boldsymbol{e_1})^\mathrm{T}\Sigma(U^\mathrm{T}\boldsymbol{e_1})
\end{aligned}
$$
where the diagonal entries of $\Sigma$ are the eigenvalues of $P$:
$$
\Sigma=\begin{pmatrix}\sigma_1&0\\0&\sigma_2\end{pmatrix}
$$
Let:
$$
\boldsymbol{n}=U^\mathrm{T}\boldsymbol{e_1}
$$
Because $U$ is orthogonal and $\boldsymbol{e_1}$ is a unit vector, $\boldsymbol{n}$ is also a unit vector, so:
$$
\boldsymbol{n}=\begin{pmatrix}n_1\\n_2\end{pmatrix}\implies n_1^2+n_2^2=1
$$
Substituting back gives:
$$
\begin{aligned}
X_1^2+X_2^2 &=(U^\mathrm{T}\boldsymbol{e_1})^\mathrm{T}\Sigma(U^\mathrm{T}\boldsymbol{e_1})\\
&=\boldsymbol{n}^\mathrm{T}\Sigma\boldsymbol{n}\\
&=\begin{pmatrix}n_1&n_2\end{pmatrix}\begin{pmatrix}\sigma_1&0\\0&\sigma_2\end{pmatrix}\begin{pmatrix}n_1\\n_2\end{pmatrix}\\
&=\sigma_1n_1^2+\sigma_2n_2^2
\end{aligned}
$$
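The change of variables can be checked numerically. The sketch below assumes the closed-form eigendecomposition of a $2\times 2$ symmetric matrix (a rotation by the principal-axis angle), which is a standard result but not something the derivation above spells out. The helper names and the sample matrix are hypothetical.

```python
import math

def eig_sym_2x2(p11, p12, p22):
    """Return (sigma1, sigma2, U) with P = U diag(sigma1, sigma2) U^T, sigma1 >= sigma2."""
    mean = (p11 + p22) / 2.0
    radius = math.hypot((p11 - p22) / 2.0, p12)
    theta = 0.5 * math.atan2(2.0 * p12, p11 - p22)  # angle of the principal axis
    c, s = math.cos(theta), math.sin(theta)
    U = ((c, -s), (s, c))  # columns are the unit eigenvectors
    return mean + radius, mean - radius, U

def check_direction(p11, p12, p22, phi):
    """Compare e^T P e with sigma1*n1^2 + sigma2*n2^2 for e = (cos phi, sin phi)."""
    sigma1, sigma2, U = eig_sym_2x2(p11, p12, p22)
    e = (math.cos(phi), math.sin(phi))
    # n = U^T e
    n1 = U[0][0] * e[0] + U[1][0] * e[1]
    n2 = U[0][1] * e[0] + U[1][1] * e[1]
    lhs = p11 * e[0] ** 2 + 2 * p12 * e[0] * e[1] + p22 * e[1] ** 2
    rhs = sigma1 * n1 ** 2 + sigma2 * n2 ** 2
    return lhs, rhs, n1 ** 2 + n2 ** 2

# Hypothetical P; any symmetric 2x2 matrix works here.
lhs, rhs, norm_sq = check_direction(10.0, 1.0, 5.0, phi=0.7)
assert math.isclose(lhs, rhs)
assert math.isclose(norm_sq, 1.0)
```

The last return value confirms that $\boldsymbol{n}$ stays on the unit circle, which is the constraint used in the next step.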

The original problem becomes:

$$
X_1^2+X_2^2=\sum_{i=1}^2 X_i^2 \quad \text{is maximized}\iff\begin{cases}\sigma_1n_1^2+\sigma_2n_2^2 \quad \text{is maximized}\\n_1^2+n_2^2=1\\\sigma_1 > \sigma_2\end{cases}
$$

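Under the constraint, $\sigma_1n_1^2+\sigma_2n_2^2=\sigma_2+(\sigma_1-\sigma_2)n_1^2$, and since $\sigma_1>\sigma_2$ the maximum is attained at $n_1^2=1$. Taking $n_1=1,n_2=0$ (the sign is arbitrary), $\boldsymbol{e_1}$ is the first column of $U$, i.e. the unit eigenvector of $P$ belonging to the largest eigenvalue $\sigma_1$:
$$
\boldsymbol{n}=\begin{pmatrix}1\\0\end{pmatrix}=U^\mathrm{T}\boldsymbol{e_1}\implies \boldsymbol{e_1}=U\begin{pmatrix}1\\0\end{pmatrix}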
$$
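To see that the top eigenvector really is the maximizer, one can compare it against a brute-force scan over unit directions. This is only a sketch under assumptions: the matrix is a made-up example, the eigenpair uses the closed form for $2\times 2$ symmetric matrices, and the helper names are hypothetical.

```python
import math

def top_eigenpair(p11, p12, p22):
    """Largest eigenvalue and a unit eigenvector of [[p11, p12], [p12, p22]]."""
    mean = (p11 + p22) / 2.0
    radius = math.hypot((p11 - p22) / 2.0, p12)
    theta = 0.5 * math.atan2(2.0 * p12, p11 - p22)
    return mean + radius, (math.cos(theta), math.sin(theta))

def quad(p11, p12, p22, e):
    """Quadratic form e^T P e."""
    return p11 * e[0] ** 2 + 2 * p12 * e[0] * e[1] + p22 * e[1] ** 2

P = (10.0, 1.0, 5.0)  # hypothetical symmetric matrix (p11, p12, p22)
sigma1, e1 = top_eigenpair(*P)

# Scan unit vectors over half a turn in 0.1 degree steps (e and -e give the same value).
angles = (k * math.pi / 1800 for k in range(1800))
best = max(quad(*P, (math.cos(t), math.sin(t))) for t in angles)

assert math.isclose(quad(*P, e1), sigma1)  # e1 attains sigma1
assert best <= sigma1 + 1e-9               # no scanned direction beats it
```

The scan never exceeds $\sigma_1$, and the eigenvector direction attains it, in line with the solution $n_1=1,n_2=0$.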
Now collect the values of each variable into a vector:
$$
\boldsymbol{X}=\begin{pmatrix}a_1\\a_2\end{pmatrix}\quad \boldsymbol{Y}=\begin{pmatrix}b_1\\b_2\end{pmatrix}
$$
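Then the matrix $P$ is:
$$
P=\begin{pmatrix}a_1^2+a_2^2&a_1b_1+a_2b_2\\a_1b_1+a_2b_2&b_1^2+b_2^2\end{pmatrix}=\begin{pmatrix}\boldsymbol{X}\cdot \boldsymbol{X}&\boldsymbol{X}\cdot \boldsymbol{Y}\\\boldsymbol{X}\cdot \boldsymbol{Y}&\boldsymbol{Y}\cdot \boldsymbol{Y}\end{pmatrix}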
$$
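The covariance matrix $Q$ is related to $P$ as follows, assuming $X$ and $Y$ have been centered to zero mean. Here $n$ is the number of data points ($n=2$ in this example), not the vector $\boldsymbol{n}$ above:
$$
Q=\frac{1}{n}P=\begin{pmatrix}\operatorname{Var}(X)&\operatorname{Cov}(X,Y)\\\operatorname{Cov}(X,Y)&\operatorname{Var}(Y)\end{pmatrix}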
$$
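The relation $Q=\frac{1}{n}P$ only holds once the data are centered, which is easy to confirm on a small example. In the sketch below the raw values are made up, `center` and `scatter_matrix` are hypothetical helper names, and the population variance (`statistics.pvariance`, dividing by $n$) is used to match the $\frac{1}{n}$ factor.

```python
import math
import statistics

def center(values):
    """Subtract the mean so the values have zero mean."""
    m = statistics.fmean(values)
    return [v - m for v in values]

def scatter_matrix(xs, ys):
    """Entries (X.X, X.Y, Y.Y) of P for value vectors X and Y."""
    xx = sum(x * x for x in xs)
    xy = sum(x * y for x, y in zip(xs, ys))
    yy = sum(y * y for y in ys)
    return xx, xy, yy

# Hypothetical raw data: X = (a1, a2), Y = (b1, b2).
xs_raw = [3.0, -1.0]
ys_raw = [1.0, 2.0]

xs, ys = center(xs_raw), center(ys_raw)
n = len(xs)
xx, xy, yy = scatter_matrix(xs, ys)
Q = (xx / n, xy / n, yy / n)  # (Var(X), Cov(X, Y), Var(Y))

# The diagonal entries match the population variances of the raw data.
assert math.isclose(Q[0], statistics.pvariance(xs_raw))
assert math.isclose(Q[2], statistics.pvariance(ys_raw))
```

Skipping the centering step would give second moments about the origin rather than variances and covariances.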

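Since $Q$ and $P$ differ only by the positive factor $\frac{1}{n}$, they share the same eigenvectors. We can therefore perform a singular value decomposition of $Q$ (which, for a symmetric positive semidefinite matrix, coincides with its eigendecomposition) and take the singular vectors as the new basis vectors. This is how PCA is tied to the covariance matrix.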
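Putting the pieces together, the whole procedure for 2-D data might look like the sketch below. It is an illustration under assumptions, not a general PCA implementation: it handles only two variables, uses the closed-form eigenvector of a $2\times 2$ symmetric matrix instead of a library SVD routine, and the function name and sample points are hypothetical.

```python
import math

def pca_first_component(points):
    """Return (direction, variance, scores) of the first principal component of 2-D points."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    centered = [(x - mx, y - my) for x, y in points]

    # Covariance matrix Q = P / n of the centered data.
    qxx = sum(x * x for x, _ in centered) / n
    qxy = sum(x * y for x, y in centered) / n
    qyy = sum(y * y for _, y in centered) / n

    # Unit eigenvector of Q for the largest eigenvalue (closed form for 2x2).
    theta = 0.5 * math.atan2(2.0 * qxy, qxx - qyy)
    direction = (math.cos(theta), math.sin(theta))
    variance = (qxx + qyy) / 2.0 + math.hypot((qxx - qyy) / 2.0, qxy)

    # 1-D coordinates of each point along the new first axis.
    scores = [x * direction[0] + y * direction[1] for x, y in centered]
    return direction, variance, scores

# Hypothetical sample points.
points = [(2.5, 2.4), (0.5, 0.7), (2.2, 2.9), (1.9, 2.2), (3.1, 3.0)]
direction, variance, scores = pca_first_component(points)

# The variance of the projected data equals the largest eigenvalue of Q.
assert math.isclose(sum(s * s for s in scores) / len(scores), variance)
```

For more than two variables one would normally hand $Q$ to a numerical eigendecomposition or SVD routine instead of using a closed form.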

