Beyond Nonconvexity: A Universal Trust-Region Method with New Analyses

Yuntian Jiang, Chang He, Chuwen Zhang, Dongdong Ge, Bo Jiang, Yinyu Ye

TL;DR

This work introduces a universal trust-region method (UTR) that combines gradient-regularized local models with a ball constraint, enabling a unified nonconvex and convex analysis. The key device is a Function-or-Stationarity-Decrease (FOSD) property, which ensures either sufficient objective decrease or gradient contraction, allowing concise global guarantees without intricate inner loops. With a simple strategy for selecting $(\sigma_k, r_k)$, UTR achieves $\tilde{O}(\epsilon^{-3/2})$ iterations for finding $\epsilon$-SOSPs in nonconvex problems and $O(\epsilon^{-1/2})$ in the convex setting, with an adaptive variant that removes the dependence on the Lipschitz constant in practice. Numerical experiments on CUTEst, logistic regression, and matrix completion demonstrate strong empirical performance of the adaptive variant, often surpassing existing TR and Newton-type methods and closing the gap between theory and practice for second-order optimization.
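As a rough illustration of what one step of a gradient-regularized model with a ball constraint can look like, the sketch below minimizes $g^\top d + \tfrac12 d^\top H d + \tfrac{\mu}{2}\|d\|^2$ subject to $\|d\| \le r$. It makes several assumptions that are not taken from the summary above: the regularization weight is set to $\mu = \sigma\|g\|^{1/2}$, the radius `r` is passed in directly, and the subproblem is solved by textbook bisection on a diagonal shift with a Cholesky test, which is not necessarily the authors' solver. The names `regularized_tr_step`, `shifted_step`, `cholesky_solve`, and `norm` are hypothetical, and the degenerate "hard case" is not treated specially.

```python
import math

def norm(v):
    return math.sqrt(sum(t * t for t in v))

def cholesky_solve(A, b):
    """Solve A x = b by Cholesky; return None if A is not positive definite."""
    n = len(b)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = A[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                if s <= 0.0:
                    return None  # non-positive pivot: A is not positive definite
                L[i][i] = math.sqrt(s)
            else:
                L[i][j] = s / L[j][j]
    y = [0.0] * n  # forward substitution: L y = b
    for i in range(n):
        y[i] = (b[i] - sum(L[i][k] * y[k] for k in range(i))) / L[i][i]
    x = [0.0] * n  # backward substitution: L^T x = y
    for i in reversed(range(n)):
        x[i] = (y[i] - sum(L[k][i] * x[k] for k in range(i + 1, n))) / L[i][i]
    return x

def shifted_step(H, g, shift):
    """Return d = -(H + shift*I)^{-1} g, or None if H + shift*I is not PD."""
    n = len(g)
    A = [[H[i][j] + (shift if i == j else 0.0) for j in range(n)] for i in range(n)]
    return cholesky_solve(A, [-t for t in g])

def regularized_tr_step(g, H, sigma, r, iters=100):
    """Sketch: min g.d + 0.5 d.H.d + (mu/2)||d||^2  s.t. ||d|| <= r,
    with mu = sigma * sqrt(||g||) (an assumed scaling)."""
    gnorm = norm(g)
    if gnorm == 0.0:
        return [0.0] * len(g)  # simplification: negative curvature is not exploited
    mu = sigma * math.sqrt(gnorm)
    d = shifted_step(H, g, mu)
    if d is not None and norm(d) <= r:
        return d  # regularized Newton step already lies inside the ball
    # Otherwise bisect on the total shift until the step lands on the boundary.
    fro = math.sqrt(sum(h * h for row in H for h in row))  # bounds |lambda_min(H)|
    lo, hi = mu, mu + fro + gnorm / r  # hi always gives a PD system with ||d|| <= r
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        d = shifted_step(H, g, mid)
        if d is None or norm(d) > r:
            lo = mid
        else:
            hi = mid
    return shifted_step(H, g, hi)

if __name__ == "__main__":
    # Indefinite Hessian: the step is pushed onto the ball boundary.
    step = regularized_tr_step([1.0, 1.0], [[1.0, 0.0], [0.0, -2.0]], sigma=1.0, r=0.5)
    print(step, norm(step))
```

With an indefinite Hessian the regularized system is not positive definite, so the ball constraint becomes active and the returned step has norm close to `r`; with a well-conditioned convex quadratic and a large radius the function simply returns the regularized Newton step.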

Abstract

The trust-region (TR) method is historically renowned for its robustness in nonconvex problems and its strong numerical performance, but the study of its performance in convex optimization is somewhat limited. This paper complements the existing literature by presenting a universal trust-region method that simultaneously incorporates quadratic regularization and a ball constraint. In particular, we introduce a novel descent property tailored for trust-region-type algorithms, enabling us to unify and streamline the analysis for both convex and nonconvex optimization. Our method exhibits an iteration complexity of $\tilde O(\epsilon^{-3/2})$ to find an $\epsilon$-approximate second-order stationary point for nonconvex optimization. Meanwhile, the analysis reveals that the universal method attains an $O(\epsilon^{-1/2})$ complexity bound for convex optimization. Finally, we develop an adaptive universal method to address practical implementations. The numerical results show the effectiveness of our method in both nonconvex and convex problems.
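For orientation, the block below writes out one common convention for an $\epsilon$-approximate second-order stationary point and illustrates the two complexity orders at a sample accuracy. Both are assumptions made for illustration: the paper's own definition may differ in constants (for example, by a factor involving the Lipschitz constant), and the numbers ignore constants and logarithmic factors.

```latex
% Assumed convention for an \epsilon-approximate SOSP (constants may differ in the paper):
\|\nabla f(x)\| \le \epsilon,
\qquad
\lambda_{\min}\!\big(\nabla^2 f(x)\big) \ge -\sqrt{\epsilon}.

% Orders of magnitude at \epsilon = 10^{-4}, ignoring constants and log factors:
\epsilon^{-3/2} = 10^{6} \quad \text{(nonconvex)},
\qquad
\epsilon^{-1/2} = 10^{2} \quad \text{(convex)}.
```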

Paper Structure

This paper contains 24 sections, 23 theorems, 98 equations, 1 figure, 4 tables, 2 algorithms.

Key Result

Lemma 2.1

If $f:\mathbb{R}^n \to \mathbb{R}$ satisfies the Lipschitz assumption, then for all $x,y\in \mathbb{R}^n$, we have

Figures (1)

  • Figure 1: Logistic regression on LIBSVM instances: a4a (left) and w8a (right).

Theorems & Definitions (42)

  • Definition 2.1
  • Lemma 2.1 (Lemma 4.1.1 in Nesterov, Lectures on Convex Optimization, 2018)
  • Lemma 2.2
  • Lemma 2.3
  • proof
  • Lemma 2.4
  • proof
  • Corollary 2.1
  • proof
  • Lemma 3.1
  • ...and 32 more