Complexity Bounds for Smooth Multiobjective Optimization

Phillipe R. Sampaio

TL;DR

This work develops an information-based oracle-complexity theory for finding $\varepsilon$-Pareto stationary points in smooth multiobjective optimization (MOO). It introduces a robust non-degenerate lifting that embeds hard scalar instances into MOO instances with distinct objectives and non-singleton Pareto fronts. The lifting ensures that the Pareto stationarity gap $\mathcal{G}(x)$ dominates the scalar gradient measure, so sharp single-objective lower bounds transfer to the MOO setting. The paper establishes tight linear lower bounds for strongly convex MOO, a separation between oblivious one-step and oblivious span methods in the convex regime, universal lower bounds for adaptive methods via geometric arguments, and a nonconvex extension under Lipschitz gradients. It also provides upper bounds via scalarization and rate comparisons, showing that accelerated gradient descent (AGD) with a fixed scalarization achieves rates of matching order, and discusses open questions on last-iterate behavior, stochastic settings, and broader extensions.

Abstract

We study the oracle complexity of finding $\varepsilon$-Pareto stationary points in smooth multiobjective optimization with $m$ objectives. Progress is measured by the Pareto stationarity gap $\mathcal{G}(x)$, the norm of the best convex combination of objective gradients. Our analysis relies on a non-degenerate lifting that embeds hard single-objective instances into MOO instances with distinct objectives and non-singleton Pareto fronts while preserving lower bounds on $\mathcal{G}$. We establish: (i) in the $\mu$-strongly convex case, any span first-order method has worst-case linear convergence no faster than $\exp(-\Theta(T/\sqrt{\kappa}))$ after $T$ oracle calls, yielding $\Theta(\sqrt{\kappa}\log(1/\varepsilon))$ iterations and matching accelerated upper bounds; (ii) in the convex case, an $\Omega(1/T)$ min-iterate lower bound for oblivious one-step methods and a universal last-iterate lower bound $\Omega(1/T^2)$ for oblivious span methods via polynomial-degree arguments, and we further show this latter bound is loose (for general adaptive methods) by importing geometric lower bounds to obtain an $\Omega(1/T)$ min-iterate lower bound for general adaptive first-order methods; (iii) in the nonconvex case with $L$-Lipschitz gradients, an $\Omega(\sqrt{L}/(T+1))$-type lower bound on $\mathcal{G}$ (tight in order), implying $\Omega(1/\varepsilon^2)$ iterations to reach $\mathcal{G}(x)\le\varepsilon$ up to natural scaling.
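To make the progress measure concrete, the following is a minimal sketch of the Pareto stationarity gap for the special case $m = 2$, where minimizing the norm of a convex combination of two gradients has a closed form (projecting the origin onto a line segment). The function names `pareto_gap_two` and `toy_gradients`, and the toy objectives $f_1(x) = \tfrac12\|x-a\|^2$, $f_2(x) = \tfrac12\|x-b\|^2$, are illustrative assumptions and do not come from the paper.

```python
import math

def pareto_gap_two(g1, g2):
    """Return (gap, t), where gap = min over t in [0, 1] of ||t*g1 + (1-t)*g2||.

    Assumption: exactly two objectives, so the minimizer has a closed form.
    """
    diff = [a - b for a, b in zip(g1, g2)]      # g1 - g2
    denom = sum(d * d for d in diff)            # ||g1 - g2||^2
    if denom == 0.0:
        t = 0.5                                 # gradients coincide; any t works
    else:
        # Unconstrained minimizer of ||g2 + t*(g1 - g2)||^2, clipped to [0, 1].
        t = -sum(d * b for d, b in zip(diff, g2)) / denom
        t = min(1.0, max(0.0, t))
    combo = [t * a + (1.0 - t) * b for a, b in zip(g1, g2)]
    return math.sqrt(sum(c * c for c in combo)), t

def toy_gradients(x, a=(-1.0, 0.0), b=(1.0, 0.0)):
    """Gradients of the hypothetical objectives 0.5*||x-a||^2 and 0.5*||x-b||^2."""
    g1 = [xi - ai for xi, ai in zip(x, a)]
    g2 = [xi - bi for xi, bi in zip(x, b)]
    return g1, g2

# A point off the segment [a, b] has a positive gap ...
gap_off, t_off = pareto_gap_two(*toy_gradients((0.0, 1.0)))
# ... while a point on the segment (the Pareto set of this toy problem) has gap zero.
gap_on, t_on = pareto_gap_two(*toy_gradients((0.5, 0.0)))
print(gap_off, gap_on)
```

For general $m$ the same quantity is a small quadratic program over the probability simplex; the closed form above is used only to keep the sketch dependency-free.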

Paper Structure

This paper contains 34 sections, 22 theorems, 144 equations.

Key Result

Lemma 3.6

For $C^1$ objectives $(f_i)$, the following are equivalent at a point $x$:
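The equivalent conditions are not listed in this summary. Judging from the lemma's title (no common descent $\Longleftrightarrow$ convex-hull stationarity) and the abstract's definition of $\mathcal{G}$, they presumably take the following standard form. This is a hedged reconstruction, not the paper's exact statement; the simplex notation $\Delta_m$ and the third item are assumptions.

```latex
\begin{enumerate}
  \item (No common descent) there is no direction $d$ with
        $\nabla f_i(x)^\top d < 0$ for all $i = 1, \dots, m$;
  \item (Convex-hull stationarity)
        $0 \in \operatorname{conv}\{\nabla f_1(x), \dots, \nabla f_m(x)\}$;
  \item $\mathcal{G}(x) = \min_{\lambda \in \Delta_m}
        \bigl\| \sum_{i=1}^{m} \lambda_i \nabla f_i(x) \bigr\| = 0$,
        where $\Delta_m = \{\lambda \ge 0 : \sum_{i} \lambda_i = 1\}$.
\end{enumerate}
```

The equivalence of the first two items is an instance of Gordan's theorem of the alternative; the third restates the second using the gap.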

Theorems & Definitions (58)

  • Definition 3.1: Dominance
  • Definition 3.2: Pareto optimality (strong)
  • Definition 3.3: Weak Pareto optimality
  • Definition 3.4: Pareto criticality
  • Definition 3.5: Pareto stationarity gap
  • Lemma 3.6: No common descent $\Longleftrightarrow$ convex-hull stationarity
  • proof
  • Proposition 3.7: General $C^1$ case: necessity
  • proof
  • Proposition 3.8: Convex $C^1$ case: characterization of weak Pareto optima
  • ...and 48 more