A multilevel proximal trust-region method for nonsmooth optimization with applications

Robert Baraldi, Michael Hintermüller, Qi Wang

TL;DR

This work introduces RMNTR, a recursive multilevel proximal trust-region algorithm for minimizing nonsmooth composite objectives $F=f+\phi$. By coupling a hierarchy of models across discretization levels with proximal updates, it achieves substantial speedups over single-level methods while retaining global convergence guarantees. The paper provides a global convergence proof and an SPG-based subproblem solver, and demonstrates effectiveness on PDE-constrained optimization and physics-informed neural networks. The results indicate that multilevel structure yields robust, scalable performance for high-dimensional nonsmooth problems in scientific computing and machine learning.

Abstract

Many large-scale optimization problems arising in science and engineering are naturally defined at multiple levels of discretization or model fidelity. Multilevel methods exploit this hierarchy to accelerate convergence by combining coarse- and fine-level information, a strategy that has proven highly effective in the numerical solution of partial differential equations and related optimization problems. Many applications in PDE-constrained optimization and data science require minimizing the sum of smooth and nonsmooth functions. For example, training neural networks may require minimizing a mean squared error plus an $L^1$-regularization term to induce sparsity in the weights. Correspondingly, we introduce a multilevel proximal trust-region method to minimize the sum of a nonconvex, smooth function and a convex, nonsmooth function. Exploiting ideas from the multilevel literature allows us to reduce the cost of the step computation, which is a major bottleneck in single-level procedures. Our work unifies the theory behind proximal trust-region methods and multilevel recursive strategies. We prove global convergence of our method in finite-dimensional spaces and provide an efficient nonsmooth subproblem solver. We show the efficiency and robustness of our algorithm by means of numerical examples in PDE-constrained optimization and machine learning.

Paper Structure

This paper contains 15 sections, 12 theorems, 56 equations, 7 figures, 1 table, 3 algorithms.

Key Result

Lemma 4.1

Let $\widetilde{s_i}(t)\coloneqq\mathop{\mathrm{Prox}}\nolimits_{t\phi_i}(x_{i,k}-t\nabla \tilde{L}_i(x_{i,k}))-x_{i,k}$ for arbitrary $t>0$, and $s_i(t):=R^\top (p_{i-1}(t)-x_{i-1,0})=R^\top s_{i-1}^c(t)$. Then
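The proximal-gradient step $\widetilde{s_i}(t)$ defined in the lemma can be illustrated with a minimal sketch. It assumes the nonsmooth term is $\phi(x)=\beta\|x\|_1$, whose proximal operator is soft-thresholding, and it uses a hypothetical quadratic smooth part in place of the level-$i$ model $\tilde{L}_i$. The names `prox_l1`, `prox_gradient_step`, `grad_f`, and `b` are illustrative and do not come from the paper.

```python
import math
from typing import Callable, List

Vector = List[float]


def prox_l1(v: Vector, threshold: float) -> Vector:
    """Soft-thresholding: proximal operator of threshold * ||.||_1."""
    return [math.copysign(max(abs(vi) - threshold, 0.0), vi) for vi in v]


def prox_gradient_step(
    x: Vector, grad: Callable[[Vector], Vector], t: float, beta: float
) -> Vector:
    """Return Prox_{t*phi}(x - t*grad(x)) - x for phi = beta * ||.||_1.

    `grad` stands in for the gradient of the smooth model at x
    (an assumption made for this sketch).
    """
    if t <= 0:
        raise ValueError("step length t must be positive")
    g = grad(x)
    shifted = [xi - t * gi for xi, gi in zip(x, g)]
    p = prox_l1(shifted, t * beta)
    return [pi - xi for pi, xi in zip(p, x)]


# Hypothetical smooth part f(x) = 0.5 * ||x - b||^2, so grad f(x) = x - b.
b = [1.0, -0.05, 0.3]


def grad_f(x: Vector) -> Vector:
    return [xi - bi for xi, bi in zip(x, b)]


beta = 0.1
step_from_zero = prox_gradient_step([0.0, 0.0, 0.0], grad_f, t=1.0, beta=beta)

# For this toy problem the minimizer is the soft-thresholded data vector,
# and the proximal-gradient step evaluated there is (numerically) zero.
x_star = prox_l1(b, beta)
step_at_minimizer = prox_gradient_step(x_star, grad_f, t=1.0, beta=beta)
```

In this toy setting, `step_from_zero` moves the large entries toward the data `b` while leaving the small middle entry at zero, which is the sparsity-inducing effect of the $L^1$ term. The norm of such a step is commonly used as a stationarity measure in proximal methods, since it vanishes at the minimizer.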

Figures (7)

  • Figure 1: Optimal control for Burgers' equation with noise: target state with step (0.05), block (0.005), and impulse (0.2) noise
  • Figure 2: Convergence comparison with $n=8192$
  • Figure 3: Optimal control for semilinear PDE with $n=128$, $\alpha=10^{-4}$ and $\beta=0.01$
  • Figure 4: Optimal control for semilinear PDE with $n=128$, $\alpha=10^{-4}$ and $\beta=0.05$
  • Figure 5: Optimal control for semilinear PDE with $n=256$, $\alpha=10^{-4}$, $\beta=0.01$, and $\widehat{\sigma}=0.5$
  • ...and 2 more figures

Theorems & Definitions (17)

  • Definition 3.2
  • Remark 1
  • Definition 3.3
  • Remark 2
  • Remark 3
  • Lemma 4.1
  • Proposition 4.2
  • Lemma 5.1
  • Corollary 5.2
  • Lemma 5.3
  • ...and 7 more