Manifold Fitting under Unbounded Noise

Zhigang Yao; Yuqing Xia

TL;DR

This work addresses fitting a latent $d$-dimensional manifold ${\cal M} \subset \mathbb{R}^D$ from data corrupted by unbounded Gaussian noise: the samples are $x_i = y_i + \xi_i$, where $y_i \sim U({\cal M})$ and $\xi_i \sim G_{\sigma}$. The output manifold ${\cal M}_{out}$ is defined through an implicit bias function $f$ built from a weighted tangent-space estimator $\Psi_x^{\alpha}$, which aggregates local projections $P_{x_i}$ at projected points rather than at the noisy samples themselves. The theoretical results show that, with high probability, ${\cal M}_{out}$ is a smooth $d$-dimensional manifold whose Hausdorff distance to ${\cal M}$ is of order $O(r^2)$ when $r = O(\sqrt{\sigma})$, and that $f$ and its derivatives are tightly controlled (e.g., $\|f(x)\|_2 \le C r^2$ on ${\cal M}$, and $\|\partial_v f(x) - \Psi_x^{\alpha} v\|_2 \le C r$). Experiments on synthetic manifolds (circle, sphere, torus) and on facial image denoising show improved accuracy over prior methods under unbounded Gaussian noise. The paper thus provides a principled framework for manifold fitting under unbounded noise, with convergence and smoothness guarantees.
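To make the sampling model $x_i = y_i + \xi_i$ concrete, the following is a minimal sketch for the simplest case, the unit circle in $\mathbb{R}^2$ ($d = 1$, $D = 2$). It assumes that $G_{\sigma}$ is an isotropic Gaussian with per-coordinate standard deviation $\sigma$; the function names `sample_noisy_circle` and `distance_to_circle` are hypothetical and do not come from the paper.

```python
import numpy as np

def sample_noisy_circle(n, sigma, seed=0):
    """Draw y_i uniformly on the unit circle and return (x, y) with x = y + xi."""
    rng = np.random.default_rng(seed)
    theta = rng.uniform(0.0, 2.0 * np.pi, size=n)
    y = np.column_stack([np.cos(theta), np.sin(theta)])  # y_i ~ U(M)
    # Assumption: G_sigma is isotropic Gaussian noise with std sigma per coordinate.
    xi = rng.normal(0.0, sigma, size=y.shape)
    return y + xi, y

def distance_to_circle(x):
    """Exact Euclidean distance from each row of x to the unit circle."""
    return np.abs(np.linalg.norm(x, axis=1) - 1.0)

x, y = sample_noisy_circle(n=2000, sigma=0.05)
d = distance_to_circle(x)
print("mean distance to M:", d.mean())
print("max distance to M: ", d.max())
```

Because the Gaussian has unbounded support, no fixed tube around ${\cal M}$ contains every sample with certainty; the largest observed distance tends to grow with the sample size, whereas a bounded-noise model would cap it.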

Abstract

There has been an emerging trend in non-Euclidean statistical analysis toward recovering a low-dimensional structure, namely a manifold, underlying high-dimensional data. Recovering the manifold requires the noise to be sufficiently concentrated. Existing methods address this problem by constructing an approximate manifold based on tangent space estimation at each sample point. Although theoretical convergence of these methods is guaranteed, the guarantees assume that either the samples are noiseless or the noise is bounded. However, if the noise is unbounded, which is a common scenario, the tangent space estimates at the noisy samples will be blurred, and fitting a manifold from blurred tangent spaces may increase the inaccuracy. In this paper, we introduce a new manifold-fitting method in which the output manifold is constructed by directly estimating the tangent spaces at the projected points on the underlying manifold, rather than at the sample points, to reduce the error caused by the noise. Assuming the noise is unbounded, our new method provides theoretical convergence with high probability, in terms of an upper bound on the distance between the estimated and the underlying manifold. The smoothness of the estimated manifold is also evaluated by bounding the supremum of its second-order difference from above. Numerical simulations are provided to validate our theoretical findings and to demonstrate the advantages of our method over other relevant manifold-fitting methods. Finally, our method is applied to real data examples.
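The tangent space estimation at sample points mentioned above is commonly done with local PCA. The sketch below illustrates that classical baseline on a circle, with and without Gaussian noise. It is not the weighted estimator proposed in the paper; the helper name `local_pca_projection`, the radius `r = 0.3`, and the noise level `0.05` are assumptions chosen only for illustration.

```python
import numpy as np

def local_pca_projection(points, center, r, d):
    """Orthogonal projector onto the top-d principal directions of the
    points within distance r of `center` (classical local PCA)."""
    nbrs = points[np.linalg.norm(points - center, axis=1) <= r]
    if len(nbrs) <= d:
        raise ValueError("not enough neighbors; increase r")
    centered = nbrs - nbrs.mean(axis=0)
    cov = centered.T @ centered / len(nbrs)
    _, vecs = np.linalg.eigh(cov)   # eigenvalues are returned in ascending order
    basis = vecs[:, -d:]            # eigenvectors of the d largest eigenvalues
    return basis @ basis.T

rng = np.random.default_rng(0)
theta = rng.uniform(0.0, 2.0 * np.pi, size=5000)
clean = np.column_stack([np.cos(theta), np.sin(theta)])
noisy = clean + rng.normal(0.0, 0.05, size=clean.shape)  # assumed noise level

query = np.array([1.0, 0.0])                     # a point on the circle
true_proj = np.array([[0.0, 0.0], [0.0, 1.0]])   # tangent at (1, 0) is the y-axis

for name, pts in [("clean", clean), ("noisy", noisy)]:
    P = local_pca_projection(pts, query, r=0.3, d=1)
    err = np.linalg.norm(P - true_proj, 2)       # spectral norm of the projector gap
    print(f"{name}: projector error = {err:.4f}")
```

Comparing the two printed errors gives a rough sense of how noise perturbs a tangent estimate anchored at the samples, which is the effect the proposed method aims to reduce by estimating at projected points instead.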
