
Theoretical Foundations of Principal Manifold Estimation with Non-Euclidean Templates

Kun Meng, Christopher Perez

Abstract

We develop a rigorous theoretical framework for principal manifold estimation that recovers a latent low-dimensional manifold from a point cloud observed in a high-dimensional ambient space. Our framework accommodates manifolds with general, potentially non-Euclidean topology, which can be inferred using tools from topological data analysis. Using the theory of Sobolev spaces on Riemannian manifolds, we establish that the proposed principal manifolds are well defined, prove convergence of the iterative algorithm used to compute them, and show consistency of the finite-sample estimator. Furthermore, we introduce a novel method for selecting the complexity level of a fitted manifold, which addresses the shortcomings of the classical fitting-error criterion. We also provide a detailed geometric interpretation of the penalty term in our framework. In addition to the theoretical developments, we present extensive numerical experiments supporting our results. This article provides theoretical foundations for approaches that have been used in applications such as robotics. More importantly, it extends these approaches to general topological settings with potential applications across a broad range of disciplines, including neuroimaging and shape data analysis.


Paper Structure

This paper contains 67 sections, 28 theorems, 302 equations, 15 figures, 1 algorithm.

Key Result

Theorem 2.1

Let $(\mathfrak{M},g)$ be a compact $d$-dimensional Riemannian manifold with $d<4$. Suppose its boundary $\partial \mathfrak{M}$ is either empty (i.e., $\mathfrak{M}$ is a closed manifold in this case) or smooth. Then, $H^2(\mathfrak{M}) \subseteq C(\mathfrak{M})$, and the inclusion map $\iota:\, H^2(\mathfrak{M}) \hookrightarrow C(\mathfrak{M})$ is compact.
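The dimension restriction $d<4$ is the standard Sobolev-embedding exponent count; the following one-line sketch is a reminder of where it comes from, not part of the paper's statement:

```latex
% On a compact d-dimensional manifold, H^{k}(\mathfrak{M}) embeds
% continuously into C(\mathfrak{M}) whenever k > d/2; the
% Rellich--Kondrachov theorem upgrades the embedding to a compact one.
\[
  H^{2}(\mathfrak{M}) \hookrightarrow C(\mathfrak{M})
  \quad \text{requires} \quad 2 > \frac{d}{2},
  \qquad \text{i.e.} \quad d \in \{1,2,3\}.
\]
```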

Figures (15)

  • Figure 1.1: A curve $\boldsymbol{f}(\mathfrak I)$ (black) fitted to a point cloud (orange), represented as the image of a function $\boldsymbol{f}:\mathfrak{I}\rightarrow\mathbb{R}^D$. For an observation $\boldsymbol X$, the point $\Pi_{\boldsymbol{f}}(\boldsymbol{X})$ (red) is its nearest-point projection onto the curve, and $\boldsymbol{\pi}_{\boldsymbol{f}}(\boldsymbol{X})$ is the projection index defined in Section \ref{section: Projection Indices}. The residual $\boldsymbol{X} - \Pi_{\boldsymbol{f}}(\boldsymbol{X})$ (blue) is perpendicular to the curve at the projection $\Pi_{\boldsymbol{f}}(\boldsymbol{X})$ (right-angle marker), and the brace indicates the error $\Vert\boldsymbol X-\Pi_{\boldsymbol{f}}(\boldsymbol X)\Vert$.
  • Figure 2.1: The point clouds in panels (b,e) are generated via the mechanisms described in Appendices \ref{appendix: Half Circle Point Cloud} and \ref{appendix: Boundary of a Flower/Star}, respectively. The persistence diagrams (PDs) \cite{fasy2014confidence} corresponding to these two point clouds are shown in panels (a,d), respectively. The significant homology features in the PDs correctly identify the topologies of the manifolds underlying the point clouds. The curves displayed in panels (b,e) are fitted using the PA algorithm (Algorithm \ref{alg: empirical pme}) and correspond to an excessively small, moderate, and excessively large value of $\lambda$. These numerical results validate Theorem \ref{thm: PME with lambda=infty}. Note that the moderate value of $\lambda$ is not optimal. The optimal choice of $\lambda$ is described in Section \ref{section: Tuning Parameter Selection} and indicated in Figure \ref{fig:pme_demo_cv_lambda_selection}. Panels (c,f) display the nonincreasing values of the cost functional $\{\mathcal{L}_{N,\lambda}(\boldsymbol{f}^{(n)}_{N,\lambda})\}_{n\in\mathbb{N}}$, where the penalty term $\Vert \nabla{}^2 \boldsymbol{f}^{(n)}_{N,\lambda} \Vert^2_{L^2(\mathfrak{M})}$ is computed using Lemma \ref{lemma: closed form penalty circle}. Theorem \ref{thm: the convergence theorem of the core iterative algorithm} implies that $\mathcal{L}_{N,\lambda}(\boldsymbol{f}^{(n)}_{N,\lambda})$ converges to $\mathcal{L}_{N,\lambda}(\boldsymbol{f}^{*}_{N,\lambda})$ as the number of iterations $n\rightarrow\infty$, under the regularity conditions specified in that theorem, where $\boldsymbol{f}^*_{N,\lambda}=\mathop{\mathrm{\arg\!\min}}\limits_{\boldsymbol{f}\in\mathscr{F}(\mathbb{P}_N)} \,\mathcal{L}_{N,\lambda}(\boldsymbol{f})$ and $\mathbb{P}_N:=\frac{1}{N}\sum_{i=1}^N \delta_{\boldsymbol{X}_i}$.
  • Figure 2.2: The blue curve is an ellipse with semi-axes $a>b$. The red segment is its medial axis \cite{blum1967transformation}, i.e., the set of points that admit more than one nearest point on the ellipse; its endpoints occur at $x=\pm (a^{2}-b^{2})/a$. For a point $\boldsymbol{x}$ away from the medial axis, $\mathop{\mathrm{\arg\!\min}}\limits_{\boldsymbol{m}\in\mathfrak{M}} \Vert \boldsymbol{x}-\boldsymbol{f}(\boldsymbol{m})\Vert$ is a singleton, where $\mathfrak{M}=\mathbb{S}^1$ is the template manifold homeomorphic to the ellipse, and $\boldsymbol{f}:\mathbb{S}^1\rightarrow\mathbb{R}^2$ parametrizes the ellipse. Then, $\boldsymbol{\pi}_{\boldsymbol{f}}(\boldsymbol{x})$ is equal to the point in the singleton. In contrast, points $\boldsymbol{x}$ on the medial axis have at least two closest points on the ellipse, and $\boldsymbol{\pi}_{\boldsymbol{f}}(\boldsymbol{x})$ is defined via a measurable selection (see Lemma \ref{lemma: Borel measurable selection of nearest-point}).
  • Figure 3.1: The star- and cashew-shaped point clouds in the first and second rows are generated via the mechanisms described in Appendices \ref{appendix: Surface of a Flower/Star (2d, 3D)} and \ref{section: Surface of a Moon/Cashew (2d, 3D)}, respectively. We apply the PA algorithm to the point clouds. The top row is initialized using only spherical normalization, and the bottom row using ISOMAP followed by spherical normalization (see Appendix \ref{section: Initialization of the PA algorithm} for details). Each column displays the fitting results corresponding to a prespecified value of $\lambda$. This figure shows that a fitted closed surface gradually shrinks to the center of a point cloud as $\lambda$ approaches $\infty$, as stated in Theorem \ref{thm: PME with lambda=infty}.
  • Figure 3.2: Illustration of the PA algorithm and the role of Assumption \ref{assumption: Assumption for the convergence of the iterative algorithm}. The vertical axis denotes the objective $\mathcal{L}_\lambda(\boldsymbol{f})$ and the horizontal axis represents the function space $\mathscr F(\mathbb{P})$ (illustration only, not to scale). The region $\mathscr U$ (green bracket; boundaries indicated by dashed vertical lines) is a "basin of attraction" in which the updating operator $\mathcal{T}_\lambda$ is intended to operate. Starting from an initialization $\boldsymbol{f}_\lambda^{(0)}\in\mathscr U$, successive iterations $\boldsymbol{f}_\lambda^{(n+1)}=\mathcal{T}_\lambda(\boldsymbol{f}_\lambda^{(n)})$ (orange points connected by blue arrows) decrease the objective and move toward the unique global minimizer $\boldsymbol{f}_\lambda^\ast\in\mathscr U$ (red point). The dashed portions of the objective curve indicate behavior outside $\mathscr U$, where additional stationary points or irregularities may occur and where the convergence guarantee is not asserted.
  • ...and 10 more figures
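The nearest-point projection and medial-axis behavior described in Figures 1.1 and 2.2 can be demonstrated numerically. The sketch below is an illustration only (not the paper's PA algorithm): it approximates the projection index $\boldsymbol{\pi}_{\boldsymbol{f}}$ by a discrete argmin over a fine parameter grid for a hypothetical ellipse with semi-axes $a=2$, $b=1$.

```python
import numpy as np

# Illustrative sketch (not the paper's PA algorithm): nearest-point
# projection onto an ellipse f(t) = (a cos t, b sin t), as depicted in
# Figures 1.1 and 2.2. The projection index pi_f(x) is approximated by
# a discrete argmin over a fine grid of parameter values t.

A, B = 2.0, 1.0  # semi-axes a > b (hypothetical values for illustration)

def curve(t):
    """Parametrization f: S^1 -> R^2 of the ellipse."""
    return np.stack([A * np.cos(t), B * np.sin(t)], axis=-1)

def projection_index(x, n_grid=200_000):
    """Approximate pi_f(x) = argmin_t ||x - f(t)|| on a uniform grid."""
    t = np.linspace(0.0, 2.0 * np.pi, n_grid, endpoint=False)
    d2 = np.sum((curve(t) - np.asarray(x)) ** 2, axis=1)
    return t[np.argmin(d2)]

# Off the medial axis: the projection is unique, and the residual
# x - f(pi_f(x)) is orthogonal to the tangent f'(t) at the projection.
x = np.array([3.0, 1.0])
t_star = projection_index(x)
residual = x - curve(t_star)
tangent = np.array([-A * np.sin(t_star), B * np.cos(t_star)])  # f'(t_star)
print(abs(residual @ tangent))  # near 0: the right-angle marker in Fig. 1.1

# On the medial axis (here the origin, inside |x| <= (A**2 - B**2) / A):
# at least two points attain the minimum distance, namely (0, B) and (0, -B),
# so pi_f must be defined via a measurable selection.
d_top = np.linalg.norm(np.array([0.0, 0.0]) - curve(np.pi / 2))
d_bot = np.linalg.norm(np.array([0.0, 0.0]) - curve(3 * np.pi / 2))
print(d_top, d_bot)  # equal distances: the nearest point is non-unique
```

The grid argmin is a crude stand-in for the measurable selection in the paper's Lemma on nearest-point projections; it suffices to verify the orthogonality of the residual and the non-uniqueness on the medial axis.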

Theorems & Definitions (60)

  • Theorem 2.1: Rellich–Kondrachov theorem
  • Lemma 2.1
  • Definition 2.1
  • Theorem 3.1
  • Theorem 3.2
  • Theorem 3.3
  • Theorem 3.4
  • Theorem 5.1
  • Theorem 5.2
  • Lemma B.1
  • ...and 50 more