Step-Size Decay and Structural Stagnation in Greedy Sparse Learning

Pablo M. Berná

TL;DR

Over-decaying step-size schedules induce structural stagnation even in low-dimensional sparse settings: for realizable regression problems with controlled feature coherence, the paper derives explicit lower bounds on the residual norm.

Abstract

Greedy algorithms are central to sparse approximation and stage-wise learning methods such as matching pursuit and boosting. It is known that the Power-Relaxed Greedy Algorithm with step sizes $m^{-\alpha}$ may fail to converge when $\alpha>1$ in general Hilbert spaces. In this work, we revisit this phenomenon from a sparse learning perspective. We study realizable regression problems with controlled feature coherence and derive explicit lower bounds on the residual norm, showing that over-decaying step-size schedules induce structural stagnation even in low-dimensional sparse settings. Numerical experiments confirm the theoretical predictions and illustrate the role of feature coherence. Our results provide insight into step-size design in greedy sparse learning.
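
A short worked bound may help explain why $\alpha>1$ is the natural threshold. It rests on an assumption that this summary does not state: that step $m$ of the algorithm can shrink the residual norm by a factor of at most $1-m^{-\alpha}$. Under that assumption, for every $M\ge 2$,

$$\|r_M\|_2 \;\ge\; \|r_1\|_2 \prod_{m=2}^{M}\bigl(1-m^{-\alpha}\bigr),$$

and the infinite product $\prod_{m\ge 2}(1-m^{-\alpha})$ is strictly positive exactly when $\sum_{m\ge 2} m^{-\alpha}<\infty$, that is, when $\alpha>1$. For instance, with $\alpha=2$ the product telescopes:

$$\prod_{m=2}^{M}\Bigl(1-\frac{1}{m^{2}}\Bigr)=\prod_{m=2}^{M}\frac{(m-1)(m+1)}{m^{2}}=\frac{M+1}{2M}\;\longrightarrow\;\frac12 .$$

So, in this illustrative setting, at least half of $\|r_1\|_2$ survives no matter how many iterations are run.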

Paper Structure

This paper contains 15 sections, 4 theorems, 72 equations, 2 figures.

Key Result

Theorem 2.1

Consider the Euclidean space $(\mathbb R^n, \|\cdot\|_2)$. Let $\alpha>1$ and define $\lambda_m = m^{-\alpha}$. Let $x_1,x_2\in\mathbb R^n$ be unit vectors, $\|x_1\|_2=\|x_2\|_2=1$, with coherence $\mu$. Consider the symmetric dictionary $\mathcal{D}$ and the realizable target $y$. Run the Power-Relaxed Greedy Algorithm (PRGA) over $\mathcal{D}$ with initialization $f_0=0$ and residual $r_0=y$. Then the residual cannot ...
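
The setting of the theorem can be explored numerically with a small sketch. Several ingredients below are assumptions, since the defining formulas do not appear in this summary: the coherence is taken as $\mu=\langle x_1,x_2\rangle\in[0,1)$, the dictionary as $\{\pm x_1,\pm x_2\}$, the target as $y=b(x_1+x_2)$, and the update as $r_m = r_{m-1}-\lambda_m\langle r_{m-1},g_m\rangle g_m$ with $g_m$ the dictionary element best aligned with $r_{m-1}$. The function names are hypothetical, and the paper's exact PRGA update may differ.

```python
import numpy as np


def make_pair(mu):
    """Two unit vectors in R^2 whose inner product is mu (assumed coherence)."""
    x1 = np.array([1.0, 0.0])
    x2 = np.array([mu, np.sqrt(1.0 - mu ** 2)])
    return x1, x2


def prga_residual_norms(y, atoms, alpha, n_steps):
    """Residual norms ||r_1||, ..., ||r_M|| under the ASSUMED update
    r_m = r_{m-1} - m^{-alpha} * <r_{m-1}, g_m> * g_m."""
    dictionary = np.vstack([atoms, -atoms])  # symmetric dictionary {+-x_1, +-x_2}
    r = np.asarray(y, dtype=float).copy()
    norms = []
    for m in range(1, n_steps + 1):
        scores = dictionary @ r            # inner products <r, g> for every g
        k = int(np.argmax(scores))         # greedy selection of g_m
        lam = m ** (-alpha)                # step size lambda_m = m^{-alpha}
        r = r - lam * scores[k] * dictionary[k]
        norms.append(np.linalg.norm(r))
    return np.array(norms)


def shrinkage_floor(alpha, n_steps):
    """Product prod_{m=2}^{M} (1 - m^{-alpha}), the largest total shrinkage
    the assumed update can achieve after the first step."""
    m = np.arange(2, n_steps + 1, dtype=float)
    return float(np.prod(1.0 - m ** (-alpha)))


if __name__ == "__main__":
    b, n_steps = 1.0, 2000                 # illustrative values, not from the paper
    for alpha in (1.1, 1.5):
        for mu in (0.0, 0.3, 0.6):
            x1, x2 = make_pair(mu)
            y = b * (x1 + x2)              # assumed realizable target
            norms = prga_residual_norms(y, np.vstack([x1, x2]), alpha, n_steps)
            floor = norms[0] * shrinkage_floor(alpha, n_steps)
            print(f"alpha={alpha} mu={mu} min residual={norms.min():.4f} "
                  f"floor={floor:.4f}")
```

Under the assumed update, each step satisfies $\|r_m\|_2\ge(1-\lambda_m)\|r_{m-1}\|_2$, so the printed minimum residual never drops below the printed floor. This floor is a property of the sketch only and is not the bound stated in the theorem.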

Figures (2)

  • Figure 1: Minimum residual norm $\min_{1\le m \le M}\|r_m\|_2$ as a function of the coherence $\mu$ for $\alpha=1.1$ and $\alpha=1.5$. Solid lines correspond to the empirical PRGA performance, while dashed lines indicate the theoretical lower bound $b(1-\mu)\sqrt{\frac{1+\mu}{2}}\,P_\alpha$.
  • Figure 2

Theorems & Definitions (9)

  • Theorem 2.1
  • proof
  • Proposition 2.2
  • proof
  • Example 2.3: Orthogonal $s$-sparse target
  • Lemma A.1
  • proof
  • Lemma A.2
  • proof