High-dimensional online learning via asynchronous decomposition: Non-divergent results, dynamic regularization, and beyond

Shixiang Liu; Zhifan Li; Hanming Yang; Jianxin Yin

Abstract

Existing high-dimensional online learning methods often face the challenge that their error bounds, or per-batch sample sizes, diverge as the number of data batches increases. To address this issue, we propose an asynchronous decomposition framework that leverages summary statistics to construct a surrogate score function for current-batch learning. This framework is implemented via a dynamic-regularized iterative hard thresholding algorithm, providing a computationally and memory-efficient solution for sparse online optimization. We provide a unified theoretical analysis that accounts for both the streaming computational error and statistical accuracy, establishing that our estimator maintains non-divergent error bounds and $\ell_0$ sparsity across all batches. Furthermore, the proposed estimator adaptively achieves additional gains as batches accumulate, attaining the oracle accuracy as if the entire historical dataset were accessible and the true support were known. These theoretical properties are further illustrated through an example of the generalized linear model.
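To make the ingredients named in the abstract concrete, the following is a minimal sketch of iterative hard thresholding driven by summary statistics, written for the special case of a least-squares loss. It is not the authors' algorithm: the dynamic regularization and the asynchronous decomposition are not reproduced, and the names `hard_threshold` and `online_iht`, the choice of step size, and the storage of a full $p \times p$ Gram matrix are all assumptions made for illustration. Each raw batch is used once to update the summaries and is then discarded.

```python
import numpy as np


def hard_threshold(v, s):
    """Keep the s largest-magnitude entries of v and set the rest to zero."""
    out = np.zeros_like(v)
    if s > 0:
        keep = np.argpartition(np.abs(v), -s)[-s:]
        out[keep] = v[keep]
    return out


def online_iht(batches, p, s, n_iter=200):
    """Illustrative streaming IHT for least squares (hypothetical helper).

    Only the running sums X^T X and X^T y are stored across batches,
    so no historical raw data is revisited.
    """
    gram = np.zeros((p, p))  # running sum of X_j^T X_j
    xty = np.zeros(p)        # running sum of X_j^T y_j
    n_total = 0
    beta = np.zeros(p)       # warm start carried from batch to batch
    for X, y in batches:
        gram += X.T @ X
        xty += X.T @ y
        n_total += X.shape[0]
        # Step size 1/L, where L is the largest eigenvalue of the averaged Gram.
        step = n_total / np.linalg.eigvalsh(gram)[-1]
        for _ in range(n_iter):
            grad = (gram @ beta - xty) / n_total  # gradient of the averaged loss
            beta = hard_threshold(beta - step * grad, s)
    return beta
```

For the least-squares loss the accumulated summaries reproduce the full-data gradient exactly, which is why no approximation appears in this sketch; for a general loss the historical contribution would have to be approximated, which is the role the abstract assigns to the surrogate score function.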

Paper Structure (40 sections, 8 theorems, 107 equations, 1 figure, 1 algorithm)

Introduction
Related work and marginal contribution
Upper bounds in High-dimensional online learning
Influence of signal strength
Optimization error and statistical error
Main contributions
Notation
Methodology
Preliminary
Revisit high-dimensional renewable learning
Renewable learning
Limitation
Asynchronous decomposition framework
Storage
Algorithm implementation
...and 25 more sections

Key Result

Proposition 1

Suppose $f_b$ satisfies RSS$(m, M, (2C_s + 1)s)$ for $b = 1, 2$, and $f_1$ satisfies RGS$\left(L_1, C_e \sqrt{s}\, \lambda_\beta^{(1,\infty)}, (C_s + 1)s\right)$. Assume Assumption \ref{assump: N1} holds with $C_p = \frac{4 C_e^2 C_\beta}{m+M}$. For $b = 1, 2$, let the learning rate $\eta_b \in \left[ \frac{1}{\cdots} \right. \ldots$ [...]; then the following $\ell_2$ and $\ell_0$ bounds hold: [...] where $C_\beta := \frac{8}{m+M} \cdot \frac{\cdots}{\cdots}$ [...]

Figures (1)

Figure 1: Relationship between the $\ell_2$ error and the batch number $b$ in online learning of high-dimensional GLMs, where for ease of display we assume each batch contains $n$ samples, so that $N_b = \sum_{j=1}^b n_j = nb$. The $\log b$ term is introduced to ensure the error bounds hold uniformly for all batches $b \ge 1$.
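One standard way a $\log b$ factor arises from a uniform-in-$b$ guarantee is a union bound with a summable sequence of failure probabilities. The display below is a generic sketch of that argument, not the paper's proof; the events $E_b$ ("the bound holds at batch $b$"), the level $\delta$, and the allocation $\delta_b$ are notation assumed here for illustration.

$$
\delta_b := \frac{\delta}{b(b+1)}, \qquad
\Pr\Big(\exists\, b \ge 1 : E_b^c\Big) \le \sum_{b=1}^{\infty} \delta_b = \delta \sum_{b=1}^{\infty} \Big(\frac{1}{b} - \frac{1}{b+1}\Big) = \delta,
$$

$$
\log\frac{1}{\delta_b} = \log\frac{1}{\delta} + \log\big(b(b+1)\big) \le \log\frac{1}{\delta} + 2\log(b+1).
$$

Under this allocation, any tail bound that depends on $\log(1/\delta_b)$ at batch $b$ pays an additional term of order $\log b$ in exchange for holding simultaneously over all batches.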

Theorems & Definitions (12)

Remark 1
Proposition 1: Burn-in
Theorem 1: General batch
Theorem 2: Sharper bound
Theorem 3: Streaming GLM
Theorem 4: Oracle accuracy and support recovery
Lemma 1: Designs
Proof of Lemma 1
Lemma 2: Sub-Gaussian errors
Proof of Lemma 2
...and 2 more
