Scaling Limit: Exact and Tractable Analysis of Online Learning Algorithms with Applications to Regularized Regression and PCA

Chuang Wang; Jonathan Mattingly; Yue M. Lu

TL;DR

The paper develops an exact, high-dimensional framework for analyzing the transient dynamics of online learning algorithms. It shows that the time-evolving joint empirical measures converge to a deterministic measure-valued process governed by a nonlinear PDE. Applied to online regularized regression and online PCA, the framework gives precise PDE characterizations of the dynamics, shows that under exchangeability the coordinates decouple into 1-D effective problems, and supplies tractable numerical PDE solutions that predict algorithm performance. Central contributions include a general meta-theorem for exchangeable Markov chains, weak-convergence results to measure-valued PDEs, and an interpretation of the dynamics as each coordinate running 1-D stochastic gradient descent on an effective energy landscape, with implications for nonconvex optimization and adaptive learning.

Abstract

We present a framework for analyzing the exact dynamics of a class of online learning algorithms in the high-dimensional scaling limit. Our results are applied to two concrete examples: online regularized linear regression and principal component analysis. As the ambient dimension tends to infinity, and with proper time scaling, we show that the time-varying joint empirical measures of the target feature vector and its estimates provided by the algorithms will converge weakly to a deterministic measure-valued process that can be characterized as the unique solution of a nonlinear PDE. Numerical solutions of this PDE can be efficiently obtained. These solutions lead to precise predictions of the performance of the algorithms, as many practical performance metrics are linear functionals of the joint empirical measures. In addition to characterizing the dynamic performance of online learning algorithms, our asymptotic analysis also provides useful insights. In particular, in the high-dimensional limit, and due to exchangeability, the original coupled dynamics associated with the algorithms will be asymptotically "decoupled", with each coordinate independently solving a 1-D effective minimization problem via stochastic gradient descent. Exploiting this insight for nonconvex optimization problems may prove an interesting line of future research.
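To make the objects in the abstract concrete, the following is a minimal simulation sketch, not the authors' implementation. It runs a streaming SGD update for ridge-regularized linear regression in dimension n and tracks the mean squared error at the rescaled time t = k/n. The MSE is one example of a linear functional of the joint empirical measure of (target coordinate, estimate coordinate), namely the average of f(ξ, x) = (x − ξ)² over coordinates. The data model, the update rule, and all names and parameter values (`online_ridge_sgd`, `tau`, `lam`, `noise`) are assumptions made for illustration; the abstract does not specify them.

```python
import numpy as np


def online_ridge_sgd(n=200, t_max=5.0, tau=0.2, lam=0.1, noise=0.1, seed=0):
    """Streaming SGD for ridge-regularized regression (illustrative assumptions).

    Assumed model: y_k = a_k . xi + noise * w_k, with a_k ~ N(0, I/n).
    Assumed update: x <- x + tau * (y_k - a_k . x) * a_k - (tau * lam / n) * x.
    Returns rescaled times t = k / n and the MSE (1/n) * sum_i (x_i - xi_i)^2.
    """
    rng = np.random.default_rng(seed)
    xi = rng.standard_normal(n)      # target feature vector (assumed i.i.d. entries)
    x = np.zeros(n)                  # initial estimate
    steps = int(t_max * n)           # k = t * n iterations reach rescaled time t

    times = np.empty(steps + 1)
    mse = np.empty(steps + 1)
    times[0], mse[0] = 0.0, np.mean((x - xi) ** 2)

    for k in range(1, steps + 1):
        a = rng.standard_normal(n) / np.sqrt(n)    # one fresh sample per step
        y = a @ xi + noise * rng.standard_normal()
        x = x + tau * (y - a @ x) * a - (tau * lam / n) * x
        # MSE is the empirical average of f(xi_i, x_i) = (x_i - xi_i)^2,
        # i.e. a linear functional of the joint empirical measure.
        times[k], mse[k] = k / n, np.mean((x - xi) ** 2)

    return times, mse


if __name__ == "__main__":
    t, err = online_ridge_sgd()
    print(f"MSE at t=0: {err[0]:.3f}, at t={t[-1]:.1f}: {err[-1]:.3f}")
```

Repeating the run for increasing n with the same `t_max` is one way to look for the concentration that the scaling limit describes: the MSE curves, plotted against t = k/n, would be expected to fluctuate less from run to run as n grows.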
