A Tutorial on Principal Component Analysis

Jonathon Shlens

TL;DR

PCA addresses how to reveal simple structure in high-dimensional data by seeking a linear change of basis that concentrates variance into a few directions. The paper presents both intuitive toy examples and rigorous linear-algebra derivations, showing that PCA is tightly linked to the eigenvector decomposition of the covariance matrix and to the singular value decomposition (SVD). It provides practical steps for centering data, computing PCA via eigendecomposition or SVD, and interpreting the resulting principal components and variances. It also discusses limitations, including nonlinearity and higher-order dependencies, and points to kernel PCA and ICA as extensions when linear, second-order assumptions are insufficient.
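The two computational routes mentioned above can be sketched in a few lines of NumPy. This is a minimal illustration, not the paper's own code: the function names `pca_eig` and `pca_svd`, the synthetic data, and the convention that rows are observations and columns are measured variables are all assumptions made for this example.

```python
import numpy as np


def pca_eig(X):
    """PCA via eigendecomposition of the sample covariance matrix.

    Assumes X has shape (n_samples, n_features): one observation per row.
    Returns variances (descending) and principal components as columns.
    """
    Xc = X - X.mean(axis=0)                 # center each variable
    cov = Xc.T @ Xc / (X.shape[0] - 1)      # sample covariance matrix
    evals, evecs = np.linalg.eigh(cov)      # eigh: symmetric input, ascending eigenvalues
    order = np.argsort(evals)[::-1]         # reorder to descending variance
    return evals[order], evecs[:, order]


def pca_svd(X):
    """PCA via the SVD of the centered data matrix (same row convention)."""
    Xc = X - X.mean(axis=0)
    _, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    variances = s**2 / (X.shape[0] - 1)     # singular values -> variances
    return variances, Vt.T                  # columns are principal components


# Hypothetical synthetic data: 200 samples of 3 linearly mixed variables.
rng = np.random.default_rng(0)
mixing = np.array([[3.0, 0.0, 0.0],
                   [1.0, 1.0, 0.0],
                   [0.0, 0.0, 0.1]])
X = rng.normal(size=(200, 3)) @ mixing

var_eig, pcs_eig = pca_eig(X)
var_svd, pcs_svd = pca_svd(X)

# Both routes should agree on the variances, and on the components up to sign.
print(np.allclose(var_eig, var_svd))
print(np.abs(np.sum(pcs_eig * pcs_svd, axis=0)))
```

The sign of each principal component is arbitrary, which is why the comparison uses the absolute value of the dot product between matching columns rather than comparing the vectors directly.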

Abstract

Principal component analysis (PCA) is a mainstay of modern data analysis, a black box that is widely used but (sometimes) poorly understood. The goal of this paper is to dispel the magic behind this black box. This manuscript focuses on building a solid intuition for how and why principal component analysis works. It crystallizes this knowledge by deriving, from simple intuitions, the mathematics behind PCA. This tutorial does not shy away from explaining the ideas informally, nor does it shy away from the mathematics. The hope is that by addressing both aspects, readers of all levels will be able to gain a better understanding of PCA as well as the when, the how and the why of applying this technique.

Paper Structure

This paper contains 21 sections, 34 equations, 6 figures.

Figures (6)

  • Figure 1: A toy example. The position of a ball attached to an oscillating spring is recorded using three cameras A, B and C. The position of the ball tracked by each camera is depicted in each panel below.
  • Figure 2: Simulated data of $(x, y)$ for camera A. The signal and noise variances $\sigma_{signal}^{2}$ and $\sigma_{noise}^{2}$ are graphically represented by the two lines subtending the cloud of data. Note that the largest direction of variance does not lie along the basis of the recording $(x_A, y_A)$ but rather along the best-fit line.
  • Figure 3: A spectrum of possible redundancies in data from the two separate measurements $r_1$ and $r_2$. The two measurements on the left are uncorrelated because one cannot predict one from the other. Conversely, the two measurements on the right are highly correlated, indicating highly redundant measurements.
  • Figure 4: Construction of the matrix form of SVD (Equation \ref{eqn:svd-matrix}) from the scalar form (Equation \ref{eqn:value-svd}).
  • Figure 5: A step-by-step instruction list on how to perform principal component analysis.
  • ...and 1 more figure
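
The claims in the captions of Figures 2 and 3 can be checked numerically with a small simulation. The sketch below is only an illustration under stated assumptions: the 30 degree angle, the standard deviations, the sample count, and all variable names are hypothetical values chosen for this example and do not come from the paper.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical setup: the ball moves along a line tilted 30 degrees
# from the camera's x-axis, with small noise perpendicular to that line.
theta = np.deg2rad(30.0)
direction = np.array([np.cos(theta), np.sin(theta)])   # signal axis
normal = np.array([-np.sin(theta), np.cos(theta)])     # noise axis
sigma_signal, sigma_noise = 5.0, 0.5
n = 2000

signal = rng.normal(scale=sigma_signal, size=n)
noise = rng.normal(scale=sigma_noise, size=n)
data = np.outer(signal, direction) + np.outer(noise, normal)  # shape (n, 2)

# PCA on the simulated (x, y) recordings.
centered = data - data.mean(axis=0)
cov = np.cov(centered, rowvar=False)
evals, evecs = np.linalg.eigh(cov)          # ascending eigenvalues
pc1 = evecs[:, -1]                          # direction of largest variance

# Angle of the first principal component, folded into [0, 180) because
# the sign of an eigenvector is arbitrary.
angle = np.degrees(np.arctan2(pc1[1], pc1[0])) % 180.0

# Ratio of variances along the two principal directions (signal vs. noise).
snr = evals[-1] / evals[0]

# Correlation between the raw x and y recordings: high correlation
# indicates redundant measurements.
corr = np.corrcoef(data, rowvar=False)[0, 1]

print(angle, snr, corr)
```

In this setup the first principal component should recover the tilted line rather than either camera axis, and the strong correlation between the raw $x$ and $y$ recordings reflects the redundancy that a change of basis removes.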