Riemannian Optimization for Non-convex Euclidean Distance Geometry with Global Recovery Guarantees
Chandler Smith, HanQin Cai, Abiy Tasissa
TL;DR
The paper addresses the Euclidean Distance Geometry (EDG) problem by casting the recovery of a point configuration from partial distances as low-rank completion of the Gram matrix on the manifold of rank-$r$ matrices. It introduces two non-convex Riemannian algorithms, one using a non-self-adjoint sampling operator $\mathcal{R}_{\Omega}$ and another using a self-adjoint surrogate $\mathcal{F}_{\Omega}$, with provable local convergence under RIP-type conditions. Two initialization schemes come with recovery guarantees: one-step hard thresholding requires $m \geq \mathcal{O}(n^{7/4} r^2 \log n)$ samples, and a resampling-based scheme with trimming improves this to $m \geq \mathcal{O}(n r^2 \log n)$. Numerical experiments on synthetic data and real protein-structure data show competitive performance, with overparameterization (rank above $r$) offering practical gains and the self-adjoint surrogate performing strongly in practice. The work advances provable non-convex EDG recovery via a dual-basis Riemannian framework and points to extensions to non-uniform sampling and broader bases.
Abstract
The problem of determining the configuration of points from partial distance information, known as the Euclidean Distance Geometry (EDG) problem, is fundamental to many tasks in the applied sciences. In this paper, we propose two algorithms grounded in the Riemannian optimization framework to address the EDG problem. Our approach formulates the problem as a low-rank matrix completion task over the Gram matrix, using partial measurements represented as expansion coefficients of the Gram matrix in a non-orthogonal basis. For the first algorithm, under a uniform sampling-with-replacement model for the observed distance entries, we demonstrate that, with high probability, a Riemannian gradient-like algorithm on the manifold of rank-$r$ matrices converges linearly to the true solution, given initialization via one-step hard thresholding. This holds provided the number of samples, $m$, satisfies $m \geq \mathcal{O}(n^{7/4}r^2 \log(n))$. With a more refined initialization, achieved through resampled Riemannian gradient-like descent, we further improve this bound to $m \geq \mathcal{O}(nr^2 \log(n))$. Our analysis for the first algorithm leverages a non-self-adjoint operator and depends on deriving eigenvalue bounds for an inner product matrix of restricted basis matrices, exploiting sparsity properties to obtain tighter guarantees than previously established. The second algorithm introduces a self-adjoint surrogate for the sampling operator. This algorithm demonstrates strong numerical performance on both synthetic and real data. Furthermore, we show that optimizing over manifolds of higher-than-rank-$r$ matrices yields superior numerical results, consistent with recent literature on overparameterization in the EDG problem.
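To make the Gram-matrix formulation concrete, the following is a minimal NumPy sketch of the standard relation between squared distances and the Gram matrix. It is not the paper's algorithm. It assumes centered points, squared distances as the measurements, and the usual EDG basis matrices $w_{ij} = e_{ii} + e_{jj} - e_{ij} - e_{ji}$, so that each squared distance is the inner product of the Gram matrix with $w_{ij}$. All function names (`gram_from_points`, `squared_distances`, `basis_matrix`, `points_from_gram`) are hypothetical and chosen only for illustration.

```python
import numpy as np

def gram_from_points(P):
    """Gram matrix X = P_c P_c^T of the centered points (rows of P are points)."""
    Pc = P - P.mean(axis=0, keepdims=True)
    return Pc @ Pc.T

def squared_distances(P):
    """Matrix of squared pairwise Euclidean distances between rows of P."""
    sq = np.sum(P**2, axis=1)
    return sq[:, None] + sq[None, :] - 2.0 * (P @ P.T)

def basis_matrix(n, i, j):
    """Assumed basis element w_ij = e_ii + e_jj - e_ij - e_ji (non-orthogonal)."""
    W = np.zeros((n, n))
    W[i, i] = W[j, j] = 1.0
    W[i, j] = W[j, i] = -1.0
    return W

def points_from_gram(X, r):
    """Recover an n-by-r configuration from a PSD Gram matrix of rank r.

    Uses the top-r eigenpairs; the result matches the original points
    only up to a rigid motion (rotation, reflection, translation).
    """
    vals, vecs = np.linalg.eigh(X)          # eigenvalues in ascending order
    idx = np.argsort(vals)[::-1][:r]        # indices of the r largest
    return vecs[:, idx] * np.sqrt(np.clip(vals[idx], 0.0, None))

# Small synthetic check: n points in r dimensions.
rng = np.random.default_rng(0)
n, r = 8, 3
P = rng.standard_normal((n, r))
X = gram_from_points(P)
D = squared_distances(P)

# Each squared distance is an expansion coefficient <X, w_ij>.
i, j = 1, 5
coeff = np.sum(X * basis_matrix(n, i, j))

# The recovered points reproduce all pairwise squared distances.
P_hat = points_from_gram(X, r)
D_hat = squared_distances(P_hat)
```

In this sketch `coeff` agrees with `D[i, j]` and `D_hat` agrees with `D` up to floating-point error, which is why completing the Gram matrix from a subset of such coefficients suffices to recover the configuration up to rigid motion.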
