Beyond Nonconvexity: A Universal Trust-Region Method with New Analyses

Yuntian Jiang; Chang He; Chuwen Zhang; Dongdong Ge; Bo Jiang; Yinyu Ye

TL;DR

This work introduces a universal trust-region method (UTR) that combines gradient-regularized local models with a ball constraint, enabling a unified nonconvex and convex analysis. The key device is a Function-or-Stationarity-Decrease (FOSD) property, which ensures either sufficient objective decrease or gradient contraction, allowing concise global guarantees without intricate inner loops. With a simple strategy for selecting $(\sigma_k, r_k)$, UTR achieves $\tilde{O}(\epsilon^{-3/2})$ iterations for finding $\epsilon$-SOSPs in nonconvex problems and $O(\epsilon^{-1/2})$ in the convex setting, with an adaptive variant that removes the dependence on the Lipschitz constant in practice. Numerical experiments on CUTEst, logistic regression, and matrix completion demonstrate strong empirical performance of the adaptive variant, often surpassing existing TR and Newton-type methods and closing the gap between theory and practice for second-order optimization.
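To make the "gradient-regularized local model with a ball constraint" concrete, here is a minimal sketch of one such step in Python. It is not the authors' implementation. The function name `regularized_tr_step`, the $\|g\|^{1/2}$ scaling of both the regularizer and the radius, and the eigendecomposition-plus-bisection solver are all assumptions made for illustration; this summary does not state the exact subproblem or how it is solved. The so-called hard case of the trust-region subproblem is not handled, so in that case the sketch returns a feasible but possibly suboptimal step.

```python
import numpy as np

def regularized_tr_step(g, H, sigma, r):
    """Sketch: approximately solve
        min_d  g^T d + 0.5 * d^T (H + sigma*sqrt(||g||) * I) d
        s.t.   ||d|| <= r * sqrt(||g||).
    The sqrt(||g||) scaling is an assumption of this sketch."""
    gnorm = np.linalg.norm(g)
    if gnorm == 0.0:
        return np.zeros_like(g)
    scale = np.sqrt(gnorm)
    radius = r * scale

    # Eigendecomposition of the regularized Hessian (eigenvalues ascending).
    w, V = np.linalg.eigh(H + sigma * scale * np.eye(len(g)))
    g_eig = V.T @ g

    def step(lam):
        # Minimizer of the model with an extra multiplier lam on ||d||^2 / 2.
        return -V @ (g_eig / (w + lam))

    # Interior case: the regularized model is strictly convex and its
    # unconstrained minimizer already lies inside the ball.
    if w[0] > 0:
        d = step(0.0)
        if np.linalg.norm(d) <= radius:
            return d

    # Boundary case: bisect on lam so that ||d(lam)|| is (nearly) the radius.
    # At hi we have w_min + hi >= ||g|| / radius, hence ||d(hi)|| <= radius.
    lo = max(0.0, -w[0])
    hi = lo + gnorm / radius
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if np.linalg.norm(step(mid)) > radius:
            lo = mid
        else:
            hi = mid
    return step(hi)  # always feasible

# Usage on a small indefinite quadratic (values chosen only for illustration).
g = np.array([1.0, 1.0])
H = np.diag([-1.0, 1.0])
d = regularized_tr_step(g, H, sigma=0.1, r=0.5)
print("step:", d, "norm:", np.linalg.norm(d))
```

In a full method, a step like this would be wrapped in an outer loop that chooses $(\sigma_k, r_k)$ at each iteration; that selection rule is the part the paper analyzes and is not reproduced here.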

Abstract

The trust-region (TR) method is historically renowned for its robustness on nonconvex problems and its strong numerical performance, but the study of its behavior in convex optimization remains limited. This paper complements the existing literature by presenting a universal trust-region method that simultaneously incorporates quadratic regularization and a ball constraint. In particular, we introduce a novel descent property tailored to trust-region-type algorithms, which lets us unify and streamline the analysis for both convex and nonconvex optimization. Our method has an iteration complexity of $\tilde{O}(\epsilon^{-3/2})$ for finding an $\epsilon$-approximate second-order stationary point in nonconvex optimization. Meanwhile, the analysis shows that the universal method attains an $O(\epsilon^{-1/2})$ complexity bound for convex optimization. Finally, we develop an adaptive universal method for practical implementations. The numerical results show the effectiveness of our method on both nonconvex and convex problems.

Paper Structure (24 sections, 23 theorems, 98 equations, 1 figure, 4 tables, 2 algorithms)

Introduction
Contributions.
The Universal Trust-Region Method
Overview of the Method
Basic Properties of the Method
Basic Principle of Choosing $(\sigma_k, r_k)$
The Universal Trust-Region Method with a Simple Strategy
Global Convergence Rate for Nonconvex Optimization
Minimizing Convex Functions
Local Convergence
The Adaptive Universal Trust-Region Method
The Adaptive Framework
Global Convergence
Nonconvex Functions
Convex Functions
...and 9 more sections

Key Result

Lemma 2.1

If $f:\mathbb{R}^n \to \mathbb{R}$ satisfies the Lipschitz assumption (assm.lipschitz), then for all $x,y\in \mathbb{R}^n$, we have
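The display that should follow "we have" did not survive extraction. For orientation only: assuming assm.lipschitz denotes a Lipschitz-continuous Hessian with constant $M$ (the symbol $M$ is an assumption here), the result this lemma is attributed to, Lemma 4.1.1 in Nesterov's lectures, is usually stated as the following pair of bounds. The paper's own statement may differ in notation.

```latex
\begin{align*}
\left\| \nabla f(y) - \nabla f(x) - \nabla^2 f(x)(y-x) \right\|
  &\le \frac{M}{2}\,\|y-x\|^2, \\
\left| f(y) - f(x) - \langle \nabla f(x),\, y-x \rangle
  - \tfrac{1}{2}\langle \nabla^2 f(x)(y-x),\, y-x \rangle \right|
  &\le \frac{M}{6}\,\|y-x\|^3 .
\end{align*}
```

The second bound is what makes a second-order model trustworthy on a ball: the model error grows only cubically with the step length.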

Figures (1)

Figure 1: Logistic regression on LIBSVM instances: a4a (left) and w8a (right).

Theorems & Definitions (42)

Definition 2.1
Lemma 2.1: Lemma 4.1.1, nesterov_lectures_2018
Lemma 2.2
Lemma 2.3
proof
Lemma 2.4
proof
Corollary 2.1
proof
Lemma 3.1
...and 32 more
