This post discusses principal component analysis (PCA), a dimension reduction technique, and demonstrates its application to Treasury yield curve analysis.
Introduction
Principal component analysis (PCA) rotates the original dataset so that the variables of the rotated dataset are uncorrelated (the rotation axes are orthogonal) and ordered by how much of the data's variation they capture. It becomes a dimension reduction technique when only the first few variables of the rotated dataset are kept. That is, let \(X \in R^{n \times p}\) be the original dataset and \(W \in R^{p \times p}\) the rotation operator; then the new dataset is
\[ T=XW \quad \text{or} \quad T_L=XW_L \]
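where \(W_L \in R^{p \times L}\) keeps the first \(L\) columns of \(W\), ordered by decreasing eigenvalue. The columns of \(W\) are the eigenvectors of \(X^TX\), which is proportional to the sample covariance matrix when the columns of \(X\) are mean-centered. In practice \(W\) is usually obtained by performing a singular value decomposition (SVD) on \(X\) directly: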
\[ X=U\Sigma W^T \Rightarrow X^TX=W\Sigma^2W^T \]
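The relations above can be checked numerically. The following is a minimal sketch in NumPy, assuming a synthetic random matrix as a stand-in for \(X\) (it is not market data) and illustrative values of `n`, `p`, and `L` chosen only for this example. It centers the columns, takes the SVD, and forms \(T\) and \(T_L\).

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for X (n observations, p variables); not real market data.
n, p, L = 500, 5, 2
X = rng.normal(size=(n, p)) @ rng.normal(size=(p, p))

# Center the columns so that X^T X / (n - 1) is the sample covariance matrix.
X = X - X.mean(axis=0)

# Thin SVD: X = U diag(s) W^T, with singular values s in descending order.
U, s, Wt = np.linalg.svd(X, full_matrices=False)
W = Wt.T

T = X @ W           # full rotation, T = XW
T_L = X @ W[:, :L]  # keep the first L components, T_L = X W_L

# Eigenvalues of the sample covariance matrix are s^2 / (n - 1).
eigvals = s**2 / (n - 1)
explained = eigvals / eigvals.sum()  # share of total variance per component
print(explained)
```

Because \(T = XW = U\Sigma\), the columns of `T` are uncorrelated and their sample variances equal `eigvals`, so `explained[:L].sum()` gives the share of total variance retained by the first `L` components.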
In this post we apply PCA to USD Treasury curves. Yields at different maturities are known to be highly correlated, and the first three principal components, commonly called level, spread (slope), and fly (butterfly, or curvature), explain most of the curve's variation. The notebook can be found here.