Dimensionality Reduction on Riemannian Manifolds in Data Analysis
Alaa El Ichi, Khalide Jbilou
TL;DR
The paper addresses dimensionality reduction for data constrained to nonlinear Riemannian manifolds, where standard Euclidean PCA can distort the intrinsic structure. It surveys and develops geometry-aware methods, centering on Principal Geodesic Analysis and Riemannian adaptations of PCA, LDA, Isomap, Laplacian Eigenmaps, and SVM, all leveraging intrinsic geometry. A unifying framework maps data to tangent spaces around the Fréchet mean, performs linear analysis there, and maps results back to the manifold when needed, enabling intrinsic embeddings. Experimental results across real, manifold, and spherical datasets show that Riemannian methods achieve higher representation fidelity and classification accuracy, with especially strong gains on curved spaces such as hyperspheres and SPD manifolds, underscoring the practical value of geometry-aware dimensionality reduction.
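The tangent-space framework summarized above can be illustrated with a minimal sketch on the unit sphere. This is not the authors' implementation: the function names (`sphere_exp`, `sphere_log`, `frechet_mean`, `tangent_pca`, `reconstruct`) are hypothetical, the Fréchet mean is approximated by a simple fixed-point iteration, and PGA is approximated by ordinary PCA in the tangent space at that mean.

```python
import numpy as np

def sphere_exp(p, v):
    """Exponential map on the unit sphere: follow the geodesic from p along tangent v."""
    norm_v = np.linalg.norm(v)
    if norm_v < 1e-12:
        return p.copy()
    return np.cos(norm_v) * p + np.sin(norm_v) * (v / norm_v)

def sphere_log(p, q):
    """Logarithm map: tangent vector at p whose length is the geodesic distance to q.
    Assumes q is not antipodal to p (the log map is not unique there)."""
    cos_theta = np.clip(np.dot(p, q), -1.0, 1.0)
    u = q - cos_theta * p              # component of q orthogonal to p
    norm_u = np.linalg.norm(u)
    if norm_u < 1e-12:
        return np.zeros_like(p)
    return np.arccos(cos_theta) * u / norm_u

def frechet_mean(points, n_iter=100, tol=1e-10):
    """Approximate the Fréchet mean by repeatedly averaging in the tangent space."""
    mean = points[0] / np.linalg.norm(points[0])
    for _ in range(n_iter):
        step = np.mean([sphere_log(mean, x) for x in points], axis=0)
        mean = sphere_exp(mean, step)
        mean /= np.linalg.norm(mean)   # guard against numerical drift off the sphere
        if np.linalg.norm(step) < tol:
            break
    return mean

def tangent_pca(points, n_components):
    """Map data to the tangent space at the Fréchet mean, then run linear PCA there."""
    mean = frechet_mean(points)
    tangents = np.array([sphere_log(mean, x) for x in points])
    # At the Fréchet mean the tangent vectors average to (approximately) zero,
    # so no extra centering is applied in this sketch.
    _, _, vt = np.linalg.svd(tangents, full_matrices=False)
    components = vt[:n_components]
    scores = tangents @ components.T
    return mean, components, scores

def reconstruct(mean, components, scores):
    """Map low-dimensional scores back to the sphere through the exponential map."""
    return np.array([sphere_exp(mean, s @ components) for s in scores])
```

Keeping fewer components than the tangent-space dimension gives a lossy intrinsic embedding; keeping all of them recovers the original points up to numerical error.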
Abstract
In this work, we investigate Riemannian-geometry-based dimensionality reduction methods that respect the underlying manifold structure of the data. In particular, we focus on Principal Geodesic Analysis (PGA) as a nonlinear generalization of PCA for manifold-valued data, and extend discriminant analysis and other established dimensionality reduction methods through Riemannian adaptations. These approaches exploit geodesic distances, tangent-space representations, and intrinsic statistical measures to achieve more faithful low-dimensional embeddings. We also discuss related manifold learning techniques and highlight their theoretical foundations and practical advantages. Experimental results on representative datasets demonstrate that Riemannian methods provide improved representation quality and classification performance compared to their Euclidean counterparts, especially for data constrained to curved spaces such as hyperspheres and symmetric positive definite (SPD) manifolds. This study underscores the importance of geometry-aware dimensionality reduction in modern machine learning and data science applications.
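To make the notion of a geodesic distance on an SPD manifold concrete, the sketch below computes the affine-invariant Riemannian distance between two SPD matrices. The abstract does not state which SPD metric is used, so the affine-invariant metric is an assumption here, and the helper names `spd_inv_sqrt` and `airm_distance` are hypothetical.

```python
import numpy as np

def spd_inv_sqrt(a):
    """Inverse matrix square root of an SPD matrix via its eigendecomposition."""
    w, v = np.linalg.eigh(a)
    return (v / np.sqrt(w)) @ v.T      # V diag(1/sqrt(w)) V^T

def airm_distance(a, b):
    """Affine-invariant distance: ||log(A^{-1/2} B A^{-1/2})||_F (assumed metric)."""
    s = spd_inv_sqrt(a)
    m = s @ b @ s
    m = (m + m.T) / 2.0                # symmetrize to suppress round-off asymmetry
    w = np.linalg.eigvalsh(m)
    return np.sqrt(np.sum(np.log(w) ** 2))

a = np.eye(2)
b = np.diag([np.e, 1.0])
g = np.array([[2.0, 1.0], [0.0, 3.0]])   # arbitrary invertible matrix

d_geo = airm_distance(a, b)                               # equals 1 for this pair
d_geo_transformed = airm_distance(g @ a @ g.T, g @ b @ g.T)
d_euc = np.linalg.norm(a - b)                             # Frobenius distance
d_euc_transformed = np.linalg.norm(g @ a @ g.T - g @ b @ g.T)
```

Under this metric the distance is unchanged by the congruence A -> G A G^T, whereas the Euclidean (Frobenius) distance between the same pair changes. That is one way in which an intrinsic distance reflects SPD geometry that a flat distance does not.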
