Minimizing smooth Kurdyka-Łojasiewicz functions via generalized descent methods: Convergence rate and complexity

Masoud Ahookhosh, Susan Ghaderi, Alireza Kabgani, Morteza Rahimi

TL;DR

This work introduces the generalized descent algorithm (DEAL) for unconstrained minimization of $C^1$ (possibly nonconvex) functions that satisfy the Kurdyka–Łojasiewicz (KL) property. DEAL enforces the generalized descent condition $f(x^{k+1})\le f(x^k)-\rho\|\nabla f(x^k)\|^{\theta}$, which covers constant step-size, Armijo line-search, boosted proximal-gradient, and boosted high-order proximal-point schemes, and it yields global convergence, convergence rates, and iteration-complexity bounds. A key finding is that linear convergence holds when the KL exponent $\vartheta$ and the descent order $\theta$ satisfy $\theta=1/\vartheta$; the boosted high-order proximal-point method converges linearly for any $\vartheta$ when the order of its regularization term is chosen accordingly. The framework extends to nonsmooth problems through smoothing techniques (e.g., forward-backward and high-order Moreau envelopes), and preliminary numerical experiments on inverse problems and LASSO support the theoretical findings.
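The role of the matching $\theta=1/\vartheta$ can be seen in a short sketch. It assumes the KL inequality takes the Łojasiewicz form $\|\nabla f(x)\|\ge c\,(f(x)-f^*)^{\vartheta}$ along the iterates, where the constant $c>0$ and the limiting value $f^*$ are notation introduced here for illustration; the paper's exact statement and constants may differ.

```latex
\begin{aligned}
f(x^{k+1}) - f^* &\le f(x^k) - f^* - \rho\,\|\nabla f(x^k)\|^{\theta} \\
                 &\le f(x^k) - f^* - \rho\,c^{\theta}\,\bigl(f(x^k)-f^*\bigr)^{\theta\vartheta} \\
                 &= \bigl(1-\rho\,c^{1/\vartheta}\bigr)\,\bigl(f(x^k)-f^*\bigr)
                 \qquad \text{when } \theta = 1/\vartheta .
\end{aligned}
```

Under these assumptions the function-value gap contracts by a fixed factor at every iteration, which is a linear rate. For instance, the familiar squared-gradient decrease ($\theta=2$) pairs with $\vartheta=1/2$.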

Abstract

This paper introduces the generalized descent algorithm (DEAL) for minimizing smooth functions and analyzes it under the Kurdyka-Łojasiewicz (KL) inequality. The algorithm guarantees a sufficient decrease that adapts to the geometry of the cost function. We leverage the KL property to establish global convergence, convergence rates, and complexity bounds, with a particular focus on the linear convergence of generalized descent methods. We show that the constant step-size and Armijo line search strategies along a generalized descent direction satisfy our generalized descent condition. For nonsmooth functions, smoothing by the forward-backward envelope and the high-order Moreau envelope shows that the boosted proximal gradient method (BPGA) and the boosted high-order proximal-point method (BPPA), respectively, are also special cases of DEAL. Notably, if the order of the high-order proximal term is chosen appropriately (depending on the KL exponent), the sequence generated by BPPA converges linearly for an arbitrary KL exponent. Our preliminary numerical experiments on inverse problems and LASSO demonstrate the efficiency of the proposed methods and are consistent with our theoretical findings.
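To make the claim about Armijo line search concrete, here is a minimal sketch of gradient descent with Armijo backtracking that records each step and then checks the generalized descent condition with $\theta=2$. It is not the paper's DEAL implementation: the test function, the parameter names (`sigma`, `beta`, `t0`), and the choice $\rho=\sigma\cdot\min_k t_k$ are assumptions made for illustration.

```python
import math

def f(x):
    # Hypothetical smooth nonconvex test function (not from the paper).
    return (x[0] ** 2 - 1.0) ** 2 + x[1] ** 2

def grad_f(x):
    # Gradient of the test function above.
    return [4.0 * x[0] * (x[0] ** 2 - 1.0), 2.0 * x[1]]

def norm(v):
    return math.sqrt(sum(vi * vi for vi in v))

def armijo_descent(x0, sigma=1e-4, beta=0.5, t0=1.0, tol=1e-8, max_iter=500):
    """Steepest descent with Armijo backtracking (illustrative sketch)."""
    x = list(x0)
    history = []  # entries: (f(x^k), f(x^{k+1}), ||grad f(x^k)||, accepted step t_k)
    for _ in range(max_iter):
        g = grad_f(x)
        gnorm = norm(g)
        if gnorm <= tol:
            break
        fx = f(x)
        t = t0
        while True:
            x_new = [xi - t * gi for xi, gi in zip(x, g)]
            # Armijo test: f(x - t g) <= f(x) - sigma * t * ||g||^2
            if f(x_new) <= fx - sigma * t * gnorm ** 2:
                break
            t *= beta  # shrink the step and try again
        history.append((fx, f(x_new), gnorm, t))
        x = x_new
    return x, history

sigma = 1e-4
x_start = [1.7, -0.9]
x_final, history = armijo_descent(x_start, sigma=sigma)

# Assumed choice: rho = sigma * (smallest accepted step), theta = 2.
rho = sigma * min(h[3] for h in history)
theta = 2.0
descent_ok = all(
    f_next <= f_curr - rho * g ** theta
    for f_curr, f_next, g, _ in history
)
print("iterations:", len(history), "final f:", f(x_final), "descent condition holds:", descent_ok)
```

Because every accepted step satisfies the Armijo inequality with its own $t_k$, replacing $t_k$ by the smallest accepted step can only weaken the required decrease, so the check passes by construction. A uniform $\rho$ valid for all iterations would need a lower bound on the step sizes, which typically follows from a Lipschitz-type assumption on the gradient.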

Paper Structure

This paper contains 16 sections, 15 theorems, 79 equations, 2 figures, 5 algorithms.

Key Result

Theorem 3.2

Let the sequence $\{x^k\}$ be generated by the DEAL algorithm. The following assertions hold:

Figures (2)

  • Figure 1: Convergence of loss function (log scale) and gradient norm over iterations. First row: DEAL-C and its heuristic versions DEAL-C1, DEAL-C2, DEAL-C3. Second row: DEAL-A and its variants DEAL-A1, DEAL-A2, DEAL-A3. Third row: Comparison between DEAL-A, DEAL-C, DEAL-C3, and DEAL-A3, the best heuristic variants.
  • Figure 2: Convergence of loss function (log scale) and gradient norm over iterations for BPGA, BPGA-1, BPGA-BB1, BPGA-BB2, and BPGA-LBFGS over 1000 iterations.

Theorems & Definitions (41)

  • Definition 2.1: Weak convexity
  • Remark 2.3
  • Definition 2.4: Kurdyka-Łojasiewicz inequality
  • Remark 3.1
  • Theorem 3.2: Global convergence of DEAL
  • proof
  • Theorem 3.3: Linear convergence rate of DEAL
  • proof
  • Theorem 3.4: Global convergence of DEAL under global KL
  • proof
  • ...and 31 more