A Homogenization Approach for Gradient-Dominated Stochastic Optimization

Jiyuan Tan, Chenyu Xue, Chuwen Zhang, Qi Deng, Dongdong Ge, Yinyu Ye

TL;DR

The paper studies stochastic non-convex optimization under gradient-dominance with index $\alpha\in[1,2]$ and introduces SHSODM, a homogenization-based second-order method that replaces cubic-regularized subproblems with an eigenvalue-based subproblem, yielding a per-iteration cost of $\tilde{O}(n^2)$. It proves that SHSODM attains competitive, and in some regimes optimal, sample complexity comparable to SCRN, with variance-reduction variants further improving rates to $O(\epsilon^{-2/\alpha})$ for $\alpha\in[1,3/2)$, while avoiding expensive linear systems. The method relies on adaptive line-search over the augmented matrix parameter $\delta_k$ and a gradient perturbation to handle degenerate directions, and uses Lanczos-type solvers to compute the leftmost eigenpair. Empirically, SHSODM demonstrates superior performance and robustness in reinforcement learning tasks against SCRN and first-order baselines, highlighting its practicality for ill-conditioned problems. The work broadens the applicability of homogenization techniques to gradient-dominated stochastic optimization and offers a scalable route for second-order methods in large-scale problems.
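To make the eigenvalue-based subproblem concrete, here is a minimal sketch of one direction computation. It assumes the augmented matrix has the block form $[H, g; g^T, -\delta]$ used in the homogenization literature. The function name `homogenized_direction`, the tolerance `tol`, and the fallback for a near-zero last component are illustrative assumptions, not the authors' implementation. A dense NumPy eigensolver stands in for the Lanczos-type solver mentioned above, and the perturbation and line-search steps are omitted.

```python
import numpy as np


def homogenized_direction(hess, grad, delta, tol=1e-8):
    """Return (direction, leftmost eigenvalue) of the augmented matrix.

    Hypothetical helper for illustration only; not the paper's code.
    """
    n = grad.shape[0]
    # Augmented (n+1) x (n+1) symmetric matrix [[H, g], [g^T, -delta]].
    aug = np.zeros((n + 1, n + 1))
    aug[:n, :n] = hess
    aug[:n, n] = grad
    aug[n, :n] = grad
    aug[n, n] = -delta

    # eigh returns eigenvalues in ascending order, so column 0 holds the
    # leftmost eigenpair. At scale, a Lanczos-type solver would replace this.
    eigvals, eigvecs = np.linalg.eigh(aug)
    lam = eigvals[0]
    v, t = eigvecs[:n, 0], eigvecs[n, 0]

    if abs(t) < tol:
        # Degenerate case (assumed handling): use v itself, oriented so it
        # does not point uphill along the gradient.
        direction = -v if grad @ v > 0 else v
    else:
        # Regular case: rescale so the homogenizing coordinate equals 1.
        direction = v / t
    return direction, lam


# Small non-convex example: the Hessian has a negative eigenvalue.
hess = np.diag([1.0, -2.0])
grad = np.array([1.0, 1.0])
direction, lam = homogenized_direction(hess, grad, delta=0.0)
```

When the last component $t$ is nonzero, the first block row of the eigen-equation gives $(H - \lambda I)d = -g$, so the direction is a shifted Newton step obtained from an eigenpair rather than from a linear solve.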

Abstract

The gradient dominance property is a condition weaker than strong convexity, yet it suffices to ensure global convergence even in non-convex optimization. This property finds wide applications in machine learning, reinforcement learning (RL), and operations management. In this paper, we propose the stochastic homogeneous second-order descent method (SHSODM) for stochastic functions enjoying the gradient dominance property, based on a recently proposed homogenization approach. Theoretically, we provide its sample complexity analysis, and further present an enhanced result by incorporating variance reduction techniques. Our findings show that SHSODM matches the best-known sample complexity achieved by other second-order methods for gradient-dominated stochastic optimization, but without cubic regularization. Empirically, since the homogenization approach relies only on solving an extremal eigenvector problem at each iteration instead of a Newton-type system, our methods gain the advantage of cheaper computational cost and robustness in ill-conditioned problems. Numerical experiments on several RL tasks demonstrate the better performance of SHSODM compared to other off-the-shelf methods.
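For reference, one common way to state gradient dominance with index $\alpha$ is sketched below. The constant $\tau$ and the minimizer notation $x^*$ are assumed symbols; the paper's exact assumption may differ in notation or constants.

```latex
% Gradient dominance with index \alpha \in [1, 2] and constant \tau > 0 (assumed notation):
f(x) - f(x^*) \le \tau \, \|\nabla f(x)\|^{\alpha} \quad \text{for all } x .

% Why it is weaker than strong convexity: if f is \mu-strongly convex, then
f(x) - f(x^*) \le \frac{1}{2\mu} \, \|\nabla f(x)\|^{2} ,
% i.e. the condition holds with \alpha = 2 and \tau = 1/(2\mu).
```

The case $\alpha = 2$ is the Polyak-Łojasiewicz condition, which non-convex functions can also satisfy.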

Paper Structure

This paper contains 19 sections, 13 theorems, 81 equations, 7 figures, 2 tables, 4 algorithms.

Key Result

Lemma 2.1

Denote by $[v_k; t_k]$ the optimal solution to problem \eqref{eq:homo-model}. We have:

Figures (7)

  • Figure 1: HalfCheetah-v2
  • Figure 2: Walker2d-v2
  • Figure 3: Humanoid-v2
  • Figure 4: Hopper-v2
  • Figure 6: The $x$-axis represents three different tested environments, including HalfCheetah-v2, Hopper-v2, and Walker2d-v2. The $y$-axis presents the total time required to obtain the update direction in $10^3$ epochs.
  • ...and 2 more figures

Theorems & Definitions (25)

  • Lemma 2.1 (zhang2022homogenous)
  • Remark 2.1
  • Theorem 3.1: Finite-step Termination of Algorithm \ref{algo:ls}
  • Remark 4.1
  • Theorem 4.1: Sample Complexity of SHSODM
  • Corollary 4.1: Informal, Sample Complexity for Policy-Based RL
  • Corollary 4.2: Deterministic Setting
  • Theorem 4.2: Sample Complexity of VR-SHSODM
  • Lemma C.1
  • proof
  • ...and 15 more