Unveiling the Cycloid Trajectory of EM Iterations in Mixed Linear Regression

Zhankun Luo; Abolfazl Hashemi

TL;DR

This work analyzes the convergence of EM for two-component mixed linear regression (2MLR) with unlabeled data. It derives explicit population EM updates across all SNR regimes using Bessel functions, and shows that in the noiseless setting the EM iterates follow a cycloid trajectory within the span of the initialization and the ground truth, which yields a precise recurrence for the sub-optimality angle. The authors prove a transition from linear to quadratic convergence and establish finite-sample error bounds for the regression parameters and mixing weights, with a three-stage convergence scheme and minimal dependence on the mixing weights. Empirical results validate the cycloid trajectory, show quadratic convergence at high SNR, and indicate that the regression update is largely independent of the true mixing weights. These results support the practicality of the theory and suggest extensions to weak separation and to more than two components.

Abstract

We study the trajectory of iterations and the convergence rates of the Expectation-Maximization (EM) algorithm for two-component Mixed Linear Regression (2MLR). The fundamental goal of MLR is to learn the regression models from unlabeled observations. The EM algorithm is widely used to fit mixtures of linear regressions. Recent results have established the super-linear convergence of EM for 2MLR in the noiseless and high SNR settings under some assumptions, and its global convergence rate with random initialization has been affirmed. However, the exponent of convergence has not been theoretically estimated, and the geometric properties of the trajectory of EM iterations are not well understood. In this paper, we first use Bessel functions to provide explicit closed-form expressions for the EM updates under all SNR regimes. Then, in the noiseless setting, we completely characterize the behavior of EM iterations by deriving a recurrence relation at the population level, and notably show that all the iterations lie on a certain cycloid. Based on this new trajectory-based analysis, we provide a theoretical estimate of the exponent of super-linear convergence and further improve the statistical error bound at the finite-sample level. Our analysis provides a new framework for studying the behavior of EM for Mixed Linear Regression.
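To make the object of study concrete, the following is a minimal sketch of finite-sample EM for symmetric 2MLR. It assumes the textbook model $y = z\langle x, \theta^\ast\rangle + \varepsilon$ with a hidden label $z \in \{+1, -1\}$, standard Gaussian covariates, noise $\varepsilon \sim \mathcal{N}(0, \sigma^2)$ with known $\sigma$, and the parameterization $\tanh(\nu) = \pi_1 - \pi_2$ for the mixing weights. The function names (`simulate_2mlr`, `em_step`, `run_em`) are hypothetical, and the update shown is the standard weighted least-squares form of EM, not necessarily the exact variant analyzed in the paper.

```python
import numpy as np

def simulate_2mlr(n, theta_star, pi1, sigma, rng):
    """Draw n samples from the assumed model y = z * <x, theta*> + noise."""
    d = theta_star.shape[0]
    X = rng.standard_normal((n, d))               # covariates x_i ~ N(0, I_d)
    z = np.where(rng.random(n) < pi1, 1.0, -1.0)  # hidden labels, P(z = +1) = pi1
    y = z * (X @ theta_star) + sigma * rng.standard_normal(n)
    return X, y

def em_step(X, y, theta, nu, sigma):
    """One EM iteration for (theta, nu), where tanh(nu) = pi_1 - pi_2."""
    # E-step: posterior mean of the hidden label z_i given (x_i, y_i).
    w = np.tanh(y * (X @ theta) / sigma**2 + nu)
    # M-step (regression): least squares with soft-labelled responses w_i * y_i.
    theta_new = np.linalg.solve(X.T @ X, X.T @ (w * y))
    # M-step (mixing weights): tanh(nu) is the average posterior mean label.
    tanh_nu = np.clip(w.mean(), -1.0 + 1e-12, 1.0 - 1e-12)
    return theta_new, np.arctanh(tanh_nu)

def run_em(X, y, theta0, sigma, n_iter=30, nu0=0.0):
    """Iterate em_step from an initial guess theta0 and balanced weights (nu0 = 0)."""
    theta, nu = theta0.copy(), nu0
    for _ in range(n_iter):
        theta, nu = em_step(X, y, theta, nu, sigma)
    return theta, nu
```

Because the model is invariant under flipping the sign of $\theta$ together with swapping the two mixing weights, the output should be compared with $\theta^\ast$ only up to sign, for example via `min(norm(theta - theta_star), norm(theta + theta_star))`.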

Paper Structure (21 sections, 51 theorems, 221 equations, 4 figures)


Introduction
Problem Setup
Population EM Updates
Population Level Analysis
Finite-sample Level Analysis
Experiments
Conclusion
Appendix
Lemmas: Integrals, Convolutions, Expectations
Relations between theta*, theta and e1, e2, e1, e2
Integrals and Expectation with Gaussian
integrals with Gaussian
expectations with Gaussian
expectations for 2MLR
Derivations for EM Update Rules
...and 6 more sections

Key Result

theorem 5

(EM Updates across All SNR) Let $\rho := \frac{\langle \theta, \theta^{\ast} \rangle}{\| \theta \| \cdot \| \theta^{\ast} \|}$, $\bar{\theta} := \frac{\theta}{\sigma}$, and $\bar{\theta}^{\ast} := \frac{\theta^{\ast}}{\sigma}$. Then the population-level EM update rules for $\theta$ and $\tanh(\nu)$ admit explicit closed-form expressions, with coefficients given in terms of Bessel functions.

Figures (4)

Figure 1: The EM update $M(\theta, \nu)$ for regression parameters lies on span$\{\theta, \theta^\ast\}$.
Figure 2: The cycloid trajectory for the EM update $M(\theta, \nu)$ of regression parameters $\theta$. The figure further shows the two global solutions (red dots), the unstable solution (blue dot), and the two saddle points (green dots). As long as the initial suboptimality angle is sufficiently large, $\varphi^t$ and in turn $\theta^t$ super-linearly converge to $\frac{\pi}{2}$ and $\theta^\ast$.
Figure 3: Cycloid trajectory of EM iterations $\theta^t$. We perform 100 iterations of finite-sample EM with SNR=$10^8$ and varying dimensions ($d=2,3,50$).
Figure 4: Left and Middle: Quadratic convergence and correlation are shown with $\theta^\ast, \theta^0$ from $d=50$ unit sphere, s.t. $\varphi^0 = \arctan(1.5)$ in Panel (a), $\varphi^0 = 0.3$ in Panel (b). Right: The errors of regression parameters and mixing weights for ten EM iterations, with $d=50, \varphi^0=0.3$, SNR=$10^8$ and different true mixing weights $\pi^\ast=\{0.6, 0.4\},\{0.8, 0.2\}, \{1, 0\}$.

Theorems & Definitions (88)

theorem 5
corollary 6
corollary 7
theorem 8
corollary 9
proposition 10
proposition 11
proposition 12
theorem 13
proposition 14
...and 78 more
