Expectation-maximization for low-SNR multi-reference alignment
Amnon Balanov, Wasim Huleihel, Tamir Bendory
TL;DR
This work analyzes the EM algorithm for multi-reference alignment (MRA) in the low-SNR regime. Near the ground truth, it establishes a two-phase local convergence and an iteration bottleneck: reaching a small fixed target accuracy requires $T \gtrsim \mathrm{SNR}^{-2}$ iterations. It also identifies an initialization-driven bias (Einstein from Noise) and a finite-sample instability (Ghost of Newton): in the noise-only setting, the Fourier phases of the initialization persist while the magnitudes decay slowly, and with finite samples phase drift accumulates over iterations. The paper provides a population-level Jacobian analysis in the low-SNR limit, a spectral decomposition in Fourier space, and an iteration-complexity lower bound, complemented by a finite-sample analysis that identifies a sample-size threshold $n \gtrsim \mathrm{SNR}^{-3}$ and phase drift scaling as $1/n$. Together, these results expose computational and initialization limitations of EM for MRA in noisy settings and suggest mitigation directions, including mini-batching and possibly second-order methods. The findings are relevant to cryo-EM and related latent-group models, where alignment under noise is essential.
Abstract
We study the multi-reference alignment (MRA) problem of recovering a signal from noisy observations acted on by unknown random circular shifts. While the information-theoretic limits of MRA are well characterized in many settings, the algorithmic behavior at low signal-to-noise ratio (SNR), the regime of practical interest, remains poorly understood. In this paper, we analyze the expectation-maximization (EM) algorithm, a widely used method for MRA, and characterize its convergence dynamics and initialization dependence in the low-SNR limit. On the convergence side, we prove a two-phase phenomenon near the ground truth as $\mathrm{SNR}\to 0$: an initial contraction with error decaying as $\exp(-\mathrm{SNR} \cdot t)$ followed by a much slower phase scaling as $\exp(-\mathrm{SNR}^2 \cdot t)$, where $t$ is the iteration number. This yields an iteration-complexity lower bound $T \gtrsim \mathrm{SNR}^{-2}$ to reach a small fixed target accuracy, revealing a severe computational bottleneck at low SNR. We also identify a finite-sample instability, which we term Ghost of Newton, in which EM initially approaches the ground truth but later diverges, degrading reconstruction quality. On the bias side, we analyze EM in the noise-only setting ($\mathrm{SNR}=0$), a regime referred to as Einstein from Noise, to highlight its pronounced sensitivity to initialization. We prove that the EM map preserves the Fourier phases of the initialization across all iterations, while the corresponding Fourier magnitudes contract toward zero at a slow rate of $(1+T)^{-1/2}$. Consequently, although the amplitudes vanish in the limit of $T \to \infty$ iterations, the reconstructed structure continues to reflect the geometry encoded by the template's Fourier phases. Together, these results expose fundamental computational and initialization-driven limitations of EM for MRA in the low-SNR regime.
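To make the object of study concrete, the following is a minimal sketch of one EM iteration for MRA with circular shifts. It is not the authors' code: it assumes i.i.d. Gaussian noise with known standard deviation `sigma` and a uniform prior over shifts, and the function names `simulate_mra` and `em_step` are hypothetical, chosen only for illustration.

```python
import numpy as np

def simulate_mra(x, n, sigma, rng):
    """Draw n observations y_i = R_{s_i} x + sigma * noise (assumed model)."""
    d = x.shape[0]
    shifts = rng.integers(0, d, size=n)            # uniform random circular shifts
    Y = np.stack([np.roll(x, s) for s in shifts])  # shifted copies of the signal
    Y = Y + sigma * rng.standard_normal((n, d))    # additive Gaussian noise
    return Y, shifts

def em_step(x, Y, sigma):
    """One EM iteration: soft-assign a shift to each row of Y, then re-average."""
    n, d = Y.shape
    # Row s holds the current estimate circularly shifted by s.
    shifted = np.stack([np.roll(x, s) for s in range(d)])
    # E-step: posterior over shifts. Since ||R_s x|| does not depend on s,
    # the log-posterior is <y_i, R_s x> / sigma^2 up to an additive constant.
    logits = Y @ shifted.T / sigma**2
    logits -= logits.max(axis=1, keepdims=True)    # stabilize the softmax
    W = np.exp(logits)
    W /= W.sum(axis=1, keepdims=True)
    # M-step: posterior-weighted average of the observations shifted back.
    x_new = np.zeros(d)
    for s in range(d):
        x_new += W[:, s] @ np.roll(Y, -s, axis=1)
    return x_new / n
```

As a usage example, `Y, _ = simulate_mra(np.zeros(d), n, 1.0, rng)` produces noise-only data. Iterating `em_step` from a chosen template and comparing `np.angle(np.fft.fft(x_t))` against the phases of the initialization is one way to probe the Einstein from Noise effect numerically. At finite `n`, the phases should be expected to agree only approximately.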
