A Homogenization Approach for Gradient-Dominated Stochastic Optimization

Jiyuan Tan; Chenyu Xue; Chuwen Zhang; Qi Deng; Dongdong Ge; Yinyu Ye

TL;DR

The paper studies stochastic non-convex optimization under gradient-dominance with index $\alpha\in[1,2]$ and introduces SHSODM, a homogenization-based second-order method that replaces cubic-regularized subproblems with an eigenvalue-based subproblem, yielding a per-iteration cost of $\tilde{O}(n^2)$. It proves that SHSODM attains competitive, and in some regimes optimal, sample complexity comparable to SCRN, with variance-reduction variants further improving rates to $O(\epsilon^{-2/\alpha})$ for $\alpha\in[1,3/2)$, while avoiding expensive linear systems. The method relies on adaptive line-search over the augmented matrix parameter $\delta_k$ and a gradient perturbation to handle degenerate directions, and uses Lanczos-type solvers to compute the leftmost eigenpair. Empirically, SHSODM demonstrates superior performance and robustness in reinforcement learning tasks against SCRN and first-order baselines, highlighting its practicality for ill-conditioned problems. The work broadens the applicability of homogenization techniques to gradient-dominated stochastic optimization and offers a scalable route for second-order methods in large-scale problems.
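
For reference, gradient dominance with index $\alpha$ is commonly written as the inequality below. This is a sketch of the standard form rather than a quotation from the paper: the constant $\tau_f$, the optimal value $f^{*}$, and the domain $\mathbb{R}^n$ are notation assumed here.

$$
f(x) - f^{*} \;\le\; \tau_f \,\|\nabla f(x)\|^{\alpha}, \qquad \alpha \in [1,2], \quad \text{for all } x \in \mathbb{R}^n .
$$

Under this form, $\alpha = 2$ corresponds to the Polyak-Łojasiewicz condition, and any stationary point is a global minimizer because the right-hand side vanishes whenever $\nabla f(x) = 0$.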

Abstract

The gradient dominance property is a condition weaker than strong convexity, yet it suffices to ensure global convergence even in non-convex optimization. This property finds wide applications in machine learning, reinforcement learning (RL), and operations management. In this paper, we propose the stochastic homogeneous second-order descent method (SHSODM) for stochastic functions enjoying the gradient dominance property, based on a recently proposed homogenization approach. Theoretically, we provide its sample complexity analysis and further present an enhanced result by incorporating variance reduction techniques. Our findings show that SHSODM matches the best-known sample complexity achieved by other second-order methods for gradient-dominated stochastic optimization, but without cubic regularization. Empirically, since the homogenization approach relies only on solving an extremal eigenvector problem at each iteration instead of a Newton-type system, our methods gain the advantage of cheaper computational cost and robustness in ill-conditioned problems. Numerical experiments on several RL tasks demonstrate the better performance of SHSODM compared to other off-the-shelf methods.
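
To make the "extremal eigenvector problem instead of a Newton-type system" idea concrete, here is a minimal sketch of one subproblem solve. It assumes the augmented matrix has the form $[H_k, g_k; g_k^T, -\delta_k]$ used in the homogenization literature and that the step is recovered as $d_k = v_k / t_k$ from the leftmost eigenvector $[v_k; t_k]$. The function name `homogenized_step`, the tolerance `t_tol`, and the use of a dense eigensolver are illustrative assumptions, not the authors' implementation.

```python
import numpy as np


def homogenized_step(g, H, delta, t_tol=1e-8):
    """Sketch of one eigenvalue-based subproblem solve (names are illustrative).

    Builds the (n+1) x (n+1) augmented matrix F = [[H, g], [g^T, -delta]],
    takes its leftmost eigenpair (lam, [v; t]) and returns d = v / t together
    with theta = -lam.
    """
    n = g.shape[0]
    F = np.empty((n + 1, n + 1))
    F[:n, :n] = H
    F[:n, n] = g
    F[n, :n] = g
    F[n, n] = -delta

    # Dense symmetric eigensolver for clarity; eigenvalues come back ascending,
    # so index 0 is the leftmost pair. At scale, a Lanczos-type solver that only
    # needs products with F would take its place.
    eigvals, eigvecs = np.linalg.eigh(F)
    lam, v, t = eigvals[0], eigvecs[:n, 0], eigvecs[n, 0]

    if abs(t) < t_tol:
        # Degenerate direction; the gradient-perturbation remedy is not reproduced here.
        raise ValueError("t is numerically zero; degenerate case not handled in this sketch")

    return v / t, -lam


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    A = rng.standard_normal((5, 5))
    H = (A + A.T) / 2  # symmetric, generally indefinite
    g = rng.standard_normal(5)
    d, theta = homogenized_step(g, H, delta=0.1)
    # d solves the shifted Newton system (H + theta I) d = -g.
    print(np.linalg.norm((H + theta * np.eye(5)) @ d + g))
```

Reading off the two block rows of the eigen-equation shows why no linear solve is needed: the first row gives $(H + \theta I)\,d = -g$ with $\theta = -\lambda_{\min}(F)$, and the second gives $g^T d = \delta - \theta \le 0$, so the recovered step is a regularized Newton direction that does not increase the first-order model.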

Paper Structure (19 sections, 13 theorems, 81 equations, 7 figures, 2 tables, 4 algorithms)

Introduction
Related Work
Preliminaries
Customized Strategies
SHSODM for Gradient-Dominated Stochastic Optimization
SHSODM
SHSODM with Variance Reduction
Numerical Experiments
Conclusion
A more comprehensive literature review
Proof Sketch
Technical Results for Theorem \ref{lemma:ls_err} on the Adaptive Search of $\delta_k$
Proof of \ref{lemma:ls_err}
Proof of Sample Complexity Results of Algorithm \ref{algo:hsodm}
Proof of Theorem \ref{thm:hsodm}
...and 4 more sections

Key Result

Lemma 2.1

Denote by $[{v}_k;t_k]$ the optimal solution to problem eq:homo-model. We have:

Figures (7)

Figure 1: HalfCheetah-v2
Figure 2: Walker2d-v2
Figure 3: Humanoid-v2
Figure 4: Hopper-v2
Figure 6: The $x$-axis represents three different tested environments, including HalfCheetah-v2, Hopper-v2, and Walker2d-v2. The $y$-axis presents the total time required to obtain the update direction in $10^3$ epochs.
...and 2 more figures

Theorems & Definitions (25)

Lemma 2.1: zhang2022homogenous
Remark 2.1
Theorem 3.1: Finite-step Termination of \ref{algo:ls}
Remark 4.1
Theorem 4.1: Sample Complexity of SHSODM
Corollary 4.1: Informal, Sample Complexity for Policy-Based RL
Corollary 4.2: Deterministic Setting
Theorem 4.2: Sample Complexity of VR-SHSODM
Lemma C.1
proof
...and 15 more
