Smooth Tchebycheff Scalarization for Multi-Objective Optimization

Xi Lin, Xiaoyuan Zhang, Zhiyuan Yang, Fei Liu, Zhenkun Wang, Qingfu Zhang

TL;DR

This work addresses differentiable multi-objective optimization by introducing Smooth Tchebycheff (STCH) scalarization, a differentiable log-sum-exp surrogate of the classic Tchebycheff approach. STCH preserves the Pareto-relevant trade-offs while enabling efficient gradient-based optimization and offering theoretical guarantees, including convergence to Pareto-stationary solutions and conditions under which all Pareto solutions can be recovered. The authors extend STCH to Pareto-set learning and demonstrate strong empirical performance on multi-task learning and Pareto-set learning benchmarks, often outperforming linear scalarization and many adaptive-gradient baselines with lower computational overhead. While primarily focused on unconstrained, deterministic problems, the paper discusses extensions to constrained and stochastic settings and outlines future directions for achieving stronger global optimality guarantees.
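
For orientation, the log-sum-exp construction mentioned above can be written out as follows. The notation is an assumption here, since this summary does not define it: $m$ objectives $f_i$, a preference vector $\boldsymbol{\lambda}$, an ideal point $\boldsymbol{z}^*$, and a smoothing parameter $\mu > 0$. The last line is the standard log-sum-exp bound, which is what makes the surrogate tight for small $\mu$.

```latex
% Classic (nonsmooth) Tchebycheff scalarization
g^{\mathrm{TCH}}(\boldsymbol{x} \mid \boldsymbol{\lambda})
  = \max_{1 \le i \le m} \lambda_i \bigl( f_i(\boldsymbol{x}) - z_i^* \bigr)

% Smooth Tchebycheff scalarization: replace max with a scaled log-sum-exp
g^{\mathrm{STCH}}_{\mu}(\boldsymbol{x} \mid \boldsymbol{\lambda})
  = \mu \log \sum_{i=1}^{m}
    \exp\!\left( \frac{\lambda_i \bigl( f_i(\boldsymbol{x}) - z_i^* \bigr)}{\mu} \right)

% Standard log-sum-exp bound: the gap is at most \mu \log m
g^{\mathrm{TCH}}(\boldsymbol{x} \mid \boldsymbol{\lambda})
  \le g^{\mathrm{STCH}}_{\mu}(\boldsymbol{x} \mid \boldsymbol{\lambda})
  \le g^{\mathrm{TCH}}(\boldsymbol{x} \mid \boldsymbol{\lambda}) + \mu \log m
```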

Abstract

Multi-objective optimization problems can be found in many real-world applications, where the objectives often conflict with each other and cannot be optimized by a single solution. In the past few decades, numerous methods have been proposed to find Pareto solutions that represent optimal trade-offs among the objectives for a given problem. However, these existing methods could have high computational complexity or may not have good theoretical properties for solving a general differentiable multi-objective optimization problem. In this work, by leveraging the smooth optimization technique, we propose a lightweight and efficient smooth Tchebycheff scalarization approach for gradient-based multi-objective optimization. It has good theoretical properties for finding all Pareto solutions with valid trade-off preferences, while enjoying significantly lower computational complexity compared to other methods. Experimental results on various real-world application problems fully demonstrate the effectiveness of our proposed method.

Paper Structure

This paper contains 62 sections, 9 theorems, 48 equations, 7 figures, 11 tables, and 1 algorithm.

Key Result

Theorem 2.3

A feasible solution $\boldsymbol{x} \in \mathcal{X}$ is weakly Pareto optimal for the original problem (eq_mop) if and only if there exists a valid preference vector $\boldsymbol{\lambda}$ such that $\boldsymbol{x}$ is an optimal solution of the Tchebycheff scalarization problem (eq_tch_scalarizatio
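
The practical consequence of this kind of result can be illustrated on a toy problem. The sketch below is not from the paper: the three candidate objective vectors, the ideal point `(0, 0)`, and the helper names are all hypothetical. Candidate `C` is Pareto optimal but lies on a non-convex part of the front, so no linear weighting selects it, whereas Tchebycheff scalarization with an equal preference does.

```python
# Hypothetical toy problem: three candidates, two objectives (both minimized).
# None of these values come from the paper.
candidates = {"A": (0.0, 1.0), "B": (1.0, 0.0), "C": (0.6, 0.6)}
ideal = (0.0, 0.0)  # assumed ideal point z*


def linear(f, lam):
    """Linear scalarization: weighted sum of the objectives."""
    return sum(l * fi for l, fi in zip(lam, f))


def tch(f, lam, z=ideal):
    """Tchebycheff scalarization: weighted worst-case distance to z*."""
    return max(l * (fi - zi) for l, fi, zi in zip(lam, f, z))


def argmin(scalarize, lam):
    """Name of the candidate minimizing the given scalarization."""
    return min(candidates, key=lambda k: scalarize(candidates[k], lam))


# Sweep preference vectors on the simplex for linear scalarization.
grid = [(i / 100.0, 1.0 - i / 100.0) for i in range(101)]
linear_winners = {argmin(linear, lam) for lam in grid}

# A single equal preference is enough for TCH to pick the non-convex point.
tch_winner = argmin(tch, (0.5, 0.5))

print("linear scalarization selects:", sorted(linear_winners))
print("TCH with lambda=(0.5, 0.5) selects:", tch_winner)
```

With a weighted sum, `C` scores about 0.6 for every weighting while the better of `A` and `B` scores at most 0.5, so `C` is never chosen; under TCH with equal preference it scores 0.3 against 0.5 for the other two.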

Figures (7)

  • Figure 1: Different multi-objective optimization methods. (a) The Pareto front is the achievable boundary of the feasible region and represents the different (possibly infinitely many) optimal trade-offs among the objectives. (b) An adaptive gradient algorithm seeks a valid gradient direction that improves all objectives, which involves solving a quadratic programming problem at each iteration. (c) Linear scalarization cannot find any Pareto solution on the non-convex part of the Pareto front, because those solutions have no supporting hyperplane. (d) Tchebycheff (TCH) scalarization can find all Pareto solutions but requires a large number of iterations. (e) The smooth Tchebycheff (STCH) scalarization proposed in this work can find all Pareto solutions under mild conditions while converging much faster.
  • Figure 2: The advantage of our proposed smooth Tchebycheff scalarization. (a) Problem and target: we want to find the Pareto solution with the exact trade-off $\boldsymbol{\lambda} = (0.5,0.5)$ on the Pareto front. (b) Classical Tchebycheff (TCH) scalarization converges slowly along an oscillating trajectory. (c) Our proposed smooth Tchebycheff (STCH) scalarization converges quickly to the exact target solution along a smooth trajectory. (d) & (e) Mean and median gaps to the target objective value vs. the number of function evaluations for the different methods, over $100$ trials.
  • Figure 3: Smoothing a nonsmooth function. (a) The simple TCH scalarization function $g(x) = \max\{f_1(x), f_2(x)\}$ with $f_1(x) = -x$ and $f_2(x) = x$, and its corresponding STCH scalarization with smoothing parameters $\mu = 1$ and $\mu = 0.2$. With a small $\mu$, $g(x)$ is tightly bounded from above and below. (b) The gradient and the smoothed gradients. The TCH scalarization has no gradient at $x = 0$, whereas STCH is differentiable everywhere.
  • Figure 4: The learned Pareto fronts for the 3-objective rocket injector design problem with different scalarization methods.
  • Figure 5: The level surfaces of smooth Tchebycheff (STCH) scalarization and different Pareto fronts.
  • ...and 2 more figures
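
The one-dimensional example in the Figure 3 caption can be reproduced numerically. The following is a minimal sketch, assuming the usual log-sum-exp smoothing $\mu \log(e^{-x/\mu} + e^{x/\mu})$ of $\max(-x, x)$; the function names and the evaluation grid are illustrative choices, not taken from the paper.

```python
import math


def tch(x):
    """Nonsmooth example from the caption: g(x) = max(-x, x) = |x|."""
    return max(-x, x)


def stch(x, mu):
    """Assumed smoothing: mu * log(exp(-x/mu) + exp(x/mu)), computed stably."""
    a = abs(x) / mu
    # log(e^a + e^-a) = a + log(1 + e^(-2a)); avoids overflow when mu is small
    return mu * (a + math.log1p(math.exp(-2.0 * a)))


def stch_grad(x, mu):
    """Derivative of stch: softmax-weighted mix of f1' = -1 and f2' = +1."""
    return math.tanh(x / mu)


def max_gap(mu, xs):
    """Largest value of stch - tch over the sample points xs."""
    return max(stch(x, mu) - tch(x) for x in xs)


xs = [i / 100.0 for i in range(-300, 301)]  # grid on [-3, 3], includes x = 0
for mu in (1.0, 0.2):
    print(f"mu={mu}: max gap={max_gap(mu, xs):.4f}, "
          f"mu*log(2)={mu * math.log(2):.4f}, "
          f"gradient at 0={stch_grad(0.0, mu)}")
```

In this sketch the smoothed function never falls below $|x|$ and exceeds it by at most $\mu \log 2$ (attained at the kink $x = 0$), so shrinking $\mu$ from $1$ to $0.2$ tightens the approximation, while the derivative stays well defined at $x = 0$.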

Theorems & Definitions (17)

  • Definition 2.1: Dominance and Strict Dominance
  • Definition 2.2: (Weakly) Pareto Optimality
  • Theorem 2.3
  • Definition 2.4: Pareto Stationary Solution
  • Definition 3.1: Smoothness
  • Definition 3.2: Smoothing Function
  • Proposition 3.3: Smooth Approximation
  • Proposition 3.4: Bounded Approximation
  • Lemma 3.5: Convexity
  • Proposition 3.6: Iteration Complexity
  • ...and 7 more