Generalized Robust Adaptive-Bandwidth Multi-View Manifold Learning in High Dimensions with Noise
Xiucai Ding, Chao Shen, Hau-Tieng Wu
TL;DR
GRAB-MDM addresses robust multiview fusion under high-dimensional noise by introducing view-dependent bandwidths within a diffusion-map framework. It constructs a block kernel with a global diffusion operator across $K$ views and performs a two-stage bandwidth selection with per-view scales $h_\ell$ and a global factor $c$, yielding a joint embedding that leverages cross-view geometry. Theoretical analysis shows that the limiting operator is a mixture of Laplace-Beltrami operators with view-specific lower-order terms; robustness in high-dimensional regimes is established, extending diffusion-map guarantees to more than two views. Empirically, GRAB-MDM improves on fixed-bandwidth and equal-bandwidth baselines and usually outperforms existing multiview methods in spectral clustering and manifold-learning tasks, with better embedding quality and clustering stability in noisy, high-dimensional settings.
Abstract
Multiview datasets are common in scientific and engineering applications, yet existing fusion methods offer limited theoretical guarantees, particularly in the presence of heterogeneous and high-dimensional noise. We propose Generalized Robust Adaptive-Bandwidth Multiview Diffusion Maps (GRAB-MDM), a new kernel-based diffusion geometry framework for integrating multiple noisy data sources. The key innovation of GRAB-MDM is a view-dependent bandwidth selection strategy that adapts to the geometry and noise level of each view, enabling a stable and principled construction of multiview diffusion operators. Under a common-manifold model, we establish asymptotic convergence results and show that the adaptive bandwidths lead to provably robust recovery of the shared intrinsic structure, even when noise levels and sensor dimensions differ across views. Numerical experiments demonstrate that GRAB-MDM significantly improves robustness and embedding quality compared with fixed-bandwidth and equal-bandwidth baselines, and usually outperforms existing algorithms. The proposed framework offers a practical and theoretically grounded solution for multiview sensor fusion in high-dimensional noisy environments.
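To make the idea of view-dependent bandwidths concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes each per-view scale $h_\ell$ is taken as a percentile of that view's pairwise squared distances, that the global factor $c$ multiplies every $h_\ell$, and that the block kernel places products of per-view Gaussian kernels in the off-diagonal blocks with zero diagonal blocks. The function names, the percentile rule, and the block layout are all illustrative assumptions; the summary above does not specify them.

```python
import numpy as np


def pairwise_sq_dists(X):
    """Squared Euclidean distances between the rows of X (n x p)."""
    sq = np.sum(X**2, axis=1)
    D = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)
    return np.maximum(D, 0.0)  # clip tiny negatives caused by round-off


def view_kernel(X, c=1.0, percentile=50.0):
    """Gaussian kernel for one view with a data-driven scale (assumed rule)."""
    D = pairwise_sq_dists(X)
    off_diag = D[~np.eye(D.shape[0], dtype=bool)]
    h = np.percentile(off_diag, percentile)  # hypothetical per-view scale h_l
    return np.exp(-D / (c * h)), h


def multiview_embedding(views, c=1.0, n_components=2):
    """Joint embedding from an assumed block kernel over K views."""
    kernels, scales = zip(*(view_kernel(X, c) for X in views))
    K, n = len(views), views[0].shape[0]
    W = np.zeros((K * n, K * n))
    for l in range(K):
        for m in range(K):
            if l != m:  # cross-view blocks only; diagonal blocks stay zero
                W[l * n:(l + 1) * n, m * n:(m + 1) * n] = kernels[l] @ kernels[m]
    d = W.sum(axis=1)
    # Symmetric conjugate of the row-stochastic operator D^{-1} W.
    S = W / np.sqrt(np.outer(d, d))
    vals, vecs = np.linalg.eigh(S)
    order = np.argsort(vals)[::-1]
    vals, vecs = vals[order], vecs[:, order]
    psi = vecs / np.sqrt(d)[:, None]  # right eigenvectors of D^{-1} W
    # Drop the trivial top eigenpair; scale coordinates by the eigenvalues.
    emb = psi[:, 1:n_components + 1] * vals[1:n_components + 1]
    return vals, emb, list(scales)


# Toy data: a circle observed by three sensors with different noise levels.
rng = np.random.default_rng(0)
n = 40
theta = np.linspace(0.0, 2.0 * np.pi, n, endpoint=False)
circle = np.column_stack([np.cos(theta), np.sin(theta)])
views = []
for p, sigma in [(20, 0.05), (50, 0.10), (100, 0.20)]:
    A = rng.standard_normal((2, p)) / np.sqrt(2.0)  # random linear sensor map
    views.append(circle @ A + sigma * rng.standard_normal((n, p)))

vals, emb, scales = multiview_embedding(views, c=1.0)
```

In this sketch `emb` stacks one block of `n` rows per view, and `scales` holds the per-view bandwidths, which differ because the noise level and sensor dimension differ across views. A single shared bandwidth would be the fixed-bandwidth baseline that the abstract compares against.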
