Robust Tangent Space Estimation via Laplacian Eigenvector Gradient Orthogonalization
Dhruv Kohli, Sawyer J. Robertson, Gal Mishne, Alexander Cloninger
TL;DR
This work addresses the fragile nature of local tangent-space estimation under noise by introducing LEGO, a spectral method that leverages the gradients of low-frequency global graph Laplacian eigenvectors to robustly estimate tangent spaces. The authors provide two theoretical foundations: a differential-geometric analysis on tubular neighborhoods showing low-frequency eigenfunctions align with the tangent bundle, and a random-matrix analysis establishing noise-robust convergence of the Laplacian and its eigenvectors. Empirically, LEGO consistently outperforms LPCA in noisy settings and delivers tangible gains across manifold learning, boundary detection, and local intrinsic-dimension estimation, including accurate torus-structured embeddings via tear-based alignment. Together, these results demonstrate that exploiting global geometric information via Laplacian eigenvectors yields more reliable local geometry, with broad practical implications for downstream data analysis tasks.
Abstract
Estimating the tangent spaces of a data manifold is a fundamental problem in data analysis. The standard approach, Local Principal Component Analysis (LPCA), struggles in high-noise settings due to a critical trade-off in choosing the neighborhood size. Selecting an optimal size requires prior knowledge of the geometric and noise characteristics of the data, which is often unavailable. In this paper, we propose a spectral method, Laplacian Eigenvector Gradient Orthogonalization (LEGO), that utilizes the global structure of the data to guide local tangent space estimation. Instead of relying solely on local neighborhoods, LEGO estimates the tangent space at each data point by orthogonalizing the gradients of low-frequency eigenvectors of the graph Laplacian. We provide two theoretical justifications for our method. First, a differential geometric analysis on a tubular neighborhood of a manifold shows that gradients of the low-frequency Laplacian eigenfunctions of the tube align closely with the manifold's tangent bundle, while eigenfunctions with high gradient in directions orthogonal to the manifold lie deeper in the spectrum. Second, a random matrix theoretic analysis demonstrates that low-frequency eigenvectors are robust to sub-Gaussian noise. Through comprehensive experiments, we demonstrate that LEGO yields tangent space estimates that are significantly more robust to noise than those from LPCA, resulting in marked improvements in downstream tasks such as manifold learning, boundary detection, and local intrinsic dimension estimation.
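
The pipeline described in the abstract (graph Laplacian, low-frequency eigenvectors, per-point gradients, orthogonalization) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation, and several choices are assumptions made only for illustration: the function name `lego_tangent_sketch`, the symmetrized k-nearest-neighbor Gaussian graph, the median bandwidth heuristic, and in particular the gradient surrogate, which is a plain finite-difference average over neighbors rather than whatever estimator the paper uses. Orthogonalization is done here with an SVD of the stacked gradient estimates.

```python
import numpy as np

def lego_tangent_sketch(X, d, n_eig=10, k=15):
    """Illustrative sketch only (hypothetical helper, not the paper's code).

    X: (n, p) data array, d: target tangent dimension (d <= min(p, n_eig)).
    Returns an (n, p, d) array of orthonormal tangent-basis estimates.
    """
    n, p = X.shape

    # Pairwise squared Euclidean distances (dense, so only for small n).
    sq = np.sum(X ** 2, axis=1)
    D2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * X @ X.T, 0.0)
    np.fill_diagonal(D2, 0.0)

    # k nearest neighbors of each point, excluding the point itself.
    nbrs = np.argsort(D2, axis=1)[:, 1:k + 1]

    # Assumed graph: Gaussian weights on kNN edges, symmetrized.
    rows = np.repeat(np.arange(n), k)
    cols = nbrs.ravel()
    eps = np.median(D2[rows, cols])  # assumed bandwidth heuristic
    W = np.zeros((n, n))
    W[rows, cols] = np.exp(-D2[rows, cols] / eps)
    W = np.maximum(W, W.T)

    # Symmetric normalized Laplacian and its low-frequency eigenvectors.
    deg = W.sum(axis=1)
    inv_sqrt_deg = 1.0 / np.sqrt(deg)
    L = np.eye(n) - inv_sqrt_deg[:, None] * W * inv_sqrt_deg[None, :]
    _, vecs = np.linalg.eigh(L)  # eigenvalues in ascending order
    # Rescale to random-walk eigenvectors and drop the trivial constant one.
    Phi = inv_sqrt_deg[:, None] * vecs[:, 1:n_eig + 1]

    bases = np.empty((n, p, d))
    for i in range(n):
        dX = X[nbrs[i]] - X[i]        # (k, p) neighbor offsets
        dPhi = Phi[nbrs[i]] - Phi[i]  # (k, n_eig) eigenvector differences
        # Assumed gradient surrogate: average of offset * difference.
        # To first order this is the local covariance applied to the
        # gradient, so it is a scaled surrogate, not an unbiased estimate.
        G = dX.T @ dPhi / k           # (p, n_eig)
        # Orthogonalize: top-d left singular vectors span the estimate.
        U, _, _ = np.linalg.svd(G, full_matrices=False)
        bases[i] = U[:, :d]
    return bases
```

Calling `lego_tangent_sketch(X, d=1)` on points sampled near a circle returns one unit vector per point, to be compared with the true tangent direction up to sign. The dense distance matrix and full eigendecomposition keep the sketch short; a sparse graph and an iterative eigensolver would be the natural substitutes at larger scale.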
