Fair PCA, One Component at a Time
Antonis Matakos, Martino Ciaperoni, Heikki Mannila
TL;DR
This paper introduces Fair-PC, a containment-preserving variant of Min-Max Fair PCA that incrementally constructs an orthonormal sequence of fair principal components, each rank-1 component minimizing the worst-group reconstruction error. The authors derive a dual formulation showing that each fair component corresponds to the leading eigenvector of a convex combination of group covariances, enabling scalable optimization via Frank-Wolfe and an SDP relaxation. They prove exact optimality and strong duality for the two-group case, $|\boldsymbol{\text{G}}|=2$, and demonstrate empirically that Fair-PC achieves balanced group reconstruction across ranks and outperforms previous Fair PCA approaches in both fairness metrics and runtime. The method retains the containment property of standard PCA, enabling nested fair subspaces and practical use in feature selection, while remaining scalable to datasets with multiple groups. Limitations include the lack of formal guarantees for more than two groups and the assumption of fixed group membership.
Abstract
The Min-Max Fair PCA problem seeks a low-rank representation of multi-group data such that the approximation error is as balanced as possible across groups. Existing approaches to this problem return a rank-$d$ fair subspace, but lack the fundamental containment property of standard PCA: each rank-$d$ PCA subspace should contain all lower-rank PCA subspaces. To fill this gap, we define fair principal components as directions that minimize the maximum group-wise reconstruction error, subject to orthogonality with previously selected components, and we introduce an iterative method to compute them. This approach preserves the containment property of standard PCA and reduces to standard PCA for data with a single group. We analyze the theoretical properties of our method and show empirically that it outperforms existing approaches to Min-Max Fair PCA.
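To make the one-component-at-a-time idea concrete, the following is a minimal sketch for two groups, not the authors' algorithm. It assumes centered data and takes the rank-1 reconstruction error of group $g$ to be $\operatorname{tr}(\Sigma_g) - v^\top \Sigma_g v$, where $\Sigma_g$ is the group covariance; the paper's exact loss may differ. In place of Frank-Wolfe or the SDP relaxation, it uses a plain grid search over the convex weight, and all function names (`group_covariances`, `group_losses`, `fair_component`, `fair_pcs`) are hypothetical.

```python
import numpy as np


def group_covariances(groups):
    """Per-group matrix X_g^T X_g / n_g (assumes each X_g is already centered)."""
    return [X.T @ X / X.shape[0] for X in groups]


def group_losses(covs, v):
    """Assumed rank-1 reconstruction error per group: tr(S_g) - v^T S_g v."""
    return np.array([np.trace(S) - v @ S @ v for S in covs])


def fair_component(covs, basis, n_grid=201):
    """Two-group sketch: scan the convex weight lam on a grid and keep the
    leading eigenvector with the smallest worst-group loss.

    `basis` holds orthonormal columns spanning the orthogonal complement of
    the components selected so far, so every candidate stays orthogonal to them.
    """
    S_a, S_b = covs  # illustration only: exactly two groups
    best_v, best_val = None, np.inf
    for lam in np.linspace(0.0, 1.0, n_grid):
        # Convex combination of group covariances, restricted to the complement.
        M = basis.T @ (lam * S_a + (1.0 - lam) * S_b) @ basis
        _, vecs = np.linalg.eigh(M)  # eigenvalues in ascending order
        v = basis @ vecs[:, -1]      # leading eigenvector, mapped back
        val = group_losses(covs, v).max()
        if val < best_val:
            best_v, best_val = v, val
    return best_v


def fair_pcs(groups, n_components):
    """Build an orthonormal sequence of components, one direction at a time."""
    covs = group_covariances(groups)
    dim = covs[0].shape[0]
    basis = np.eye(dim)
    comps = []
    for _ in range(n_components):
        comps.append(fair_component(covs, basis))
        # Complete QR: the trailing columns of Q span the orthogonal
        # complement of the components chosen so far.
        Q, _ = np.linalg.qr(np.column_stack(comps), mode="complete")
        basis = Q[:, len(comps):]
    return np.column_stack(comps)
```

Calling `fair_pcs([X_a, X_b], d)` returns a matrix whose first $k$ columns are the same as those returned for rank $k < d$, which is the containment property described above. The grid search only approximates the optimal weight; it is meant to illustrate the structure of the problem rather than to be efficient or exact.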
