Non-linear PCA via Evolution Strategies: a Novel Objective Function
Thomas Uriot, Elise Chung
TL;DR
This work introduces a non-linear PCA (NLPCA) framework that preserves interpretability by parameterizing per-variable transformations with neural networks and optimizing them via Evolution Strategies, which sidesteps the non-differentiability of eigendecomposition. A key innovation is a granular partial objective that decomposes the global explained variance into per-variable contributions $c_{j,l}$, yielding a stronger learning signal and better handling of mixed numerical, categorical, and ordinal data. The approach achieves higher explained variance than linear PCA and kernel PCA on synthetic and real OpenML datasets, while remaining interpretable through standard visualization tools such as biplots. The result is a scalable, interpretable NLPCA framework with potential applications in mixed-data dimensionality reduction and exploratory data analysis.
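To make the idea of a per-variable decomposition concrete, the following minimal sketch uses the standard PCA identity that the eigenvalue of component $l$ equals the sum over variables of $\lambda_l v_{j,l}^2$ when the eigenvectors have unit norm. Treating $c_{j,l} = \lambda_l v_{j,l}^2$ as the contribution is an assumption made here for illustration; the exact definition used by the authors is not stated in this summary. The function name `variable_contributions` and the toy data are hypothetical.

```python
import numpy as np

def variable_contributions(X, n_components):
    """Return C of shape (n_variables, n_components) with
    C[j, l] = lambda_l * v[j, l]**2 (assumed form of c_{j,l})."""
    # Standardize columns so that Z.T @ Z / n is the correlation matrix.
    Z = (X - X.mean(axis=0)) / X.std(axis=0)
    R = (Z.T @ Z) / Z.shape[0]
    eigvals, eigvecs = np.linalg.eigh(R)  # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:n_components]
    lam, V = eigvals[order], eigvecs[:, order]
    # Broadcasting scales column l of V**2 by lambda_l.
    return lam * V**2

# Toy data: four variables, two of them correlated.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))
X[:, 1] += 0.8 * X[:, 0]

C = variable_contributions(X, n_components=2)
# Summing all contributions and dividing by the number of variables gives
# the fraction of total variance explained by the retained components.
explained = C.sum() / X.shape[1]
```

Each column of `C` sums to the corresponding eigenvalue, so the global explained variance is recovered exactly from the per-variable terms.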
Abstract
Principal Component Analysis (PCA) is a powerful and popular dimensionality reduction technique. However, due to its linear nature, it often fails to capture the complex underlying structure of real-world data. While Kernel PCA (kPCA) addresses non-linearity, it sacrifices interpretability and struggles with hyperparameter selection. In this paper, we propose a robust non-linear PCA framework that unifies the interpretability of PCA with the flexibility of neural networks. Our method parameterizes variable transformations via neural networks, optimized using Evolution Strategies (ES) to handle the non-differentiability of eigendecomposition. We introduce a novel, granular objective function that maximizes the individual variance contribution of each variable, providing a stronger learning signal than global variance maximization. This approach natively handles categorical and ordinal variables without the dimensional explosion associated with one-hot encoding. We demonstrate that our method significantly outperforms both linear PCA and kPCA in explained variance across synthetic and real-world datasets. At the same time, it preserves PCA's interpretability, enabling visualization and analysis of feature contributions using standard tools such as biplots. The code can be found on GitHub.
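The overall pipeline described above can be sketched as follows: each variable passes through its own small network, PCA is run on the transformed variables, and the network weights are updated with a gradient-free search. This is only an illustrative sketch under several assumptions. It uses a basic Gaussian-perturbation ES with antithetic sampling, which may differ from the ES variant used by the authors; it maximizes the global explained variance rather than the granular objective; and the residual one-hidden-layer architecture, the function names (`transform`, `explained_variance`, `evolve`), and all hyperparameters are hypothetical choices.

```python
import numpy as np

def transform(X, theta):
    """One small network per variable:
    phi_j(x) = x + sum_h w2[j, h] * tanh(w1[j, h] * x + b1[j, h])."""
    w1, b1, w2 = theta[:, 0], theta[:, 1], theta[:, 2]  # each (p, H)
    hidden = np.tanh(X[:, :, None] * w1 + b1)           # (n, p, H)
    return X + (hidden * w2).sum(axis=2)                # (n, p)

def explained_variance(X, theta, k):
    """Fraction of variance captured by the top-k components of the
    correlation matrix of the transformed variables."""
    T = transform(X, theta)
    Z = (T - T.mean(axis=0)) / (T.std(axis=0) + 1e-8)
    R = (Z.T @ Z) / Z.shape[0]
    eigvals = np.linalg.eigvalsh(R)  # ascending order
    return eigvals[-k:].sum() / X.shape[1]

def evolve(X, k=1, hidden=4, pop=10, sigma=0.1, lr=0.05, steps=30, seed=0):
    rng = np.random.default_rng(seed)
    # Zero weights make every phi_j the identity, so the search starts
    # from ordinary linear PCA.
    theta = np.zeros((X.shape[1], 3, hidden))
    best_theta, best_fit = theta.copy(), explained_variance(X, theta, k)
    for _ in range(steps):
        eps = rng.normal(size=(pop,) + theta.shape)
        eps = np.concatenate([eps, -eps])  # antithetic pairs
        fits = np.array([explained_variance(X, theta + sigma * e, k)
                         for e in eps])
        # Standardize fitness values before forming the search direction.
        adv = (fits - fits.mean()) / (fits.std() + 1e-8)
        grad = np.tensordot(adv, eps, axes=1) / (len(eps) * sigma)
        theta = theta + lr * grad
        fit = explained_variance(X, theta, k)
        if fit > best_fit:  # keep the best parameters seen so far
            best_theta, best_fit = theta.copy(), fit
    return best_theta, best_fit

# Toy data with a purely non-linear dependence between two variables.
rng = np.random.default_rng(1)
x0 = rng.normal(size=200)
X = np.column_stack([x0, x0**2 + 0.1 * rng.normal(size=200)])

baseline = explained_variance(X, np.zeros((2, 3, 4)), k=1)  # linear PCA
theta, best_fit = evolve(X, k=1)
```

Because the search starts from the identity transformation and keeps the best parameters encountered, `best_fit` can never fall below the linear PCA `baseline`. The eigendecomposition is only ever evaluated, never differentiated, which is the property that makes ES a natural fit here.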
