Smooth Tchebycheff Scalarization for Multi-Objective Optimization
Xi Lin, Xiaoyuan Zhang, Zhiyuan Yang, Fei Liu, Zhenkun Wang, Qingfu Zhang
TL;DR
This work addresses differentiable multi-objective optimization by introducing Smooth Tchebycheff (STCH) scalarization, a differentiable log-sum-exp surrogate of the classic Tchebycheff approach. STCH preserves the Pareto-relevant trade-offs while enabling efficient gradient-based optimization and offering theoretical guarantees, including convergence to Pareto-stationary solutions and conditions under which all Pareto solutions can be recovered. The authors extend STCH to Pareto-set learning and demonstrate strong empirical performance on multi-task learning and Pareto-set learning benchmarks, often outperforming linear scalarization and many adaptive-gradient baselines with lower computational overhead. Although the paper focuses primarily on unconstrained, deterministic problems, it discusses extensions to constrained and stochastic settings and outlines future directions for achieving stronger global optimality guarantees.
Abstract
Multi-objective optimization problems can be found in many real-world applications, where the objectives often conflict with each other and cannot be optimized by a single solution. In the past few decades, numerous methods have been proposed to find Pareto solutions that represent optimal trade-offs among the objectives for a given problem. However, these existing methods can have high computational complexity or may lack good theoretical properties for solving a general differentiable multi-objective optimization problem. In this work, by leveraging smooth optimization techniques, we propose a lightweight and efficient smooth Tchebycheff scalarization approach for gradient-based multi-objective optimization. It has good theoretical properties for finding all Pareto solutions with valid trade-off preferences, while enjoying significantly lower computational complexity compared to other methods. Experimental results on various real-world application problems demonstrate the effectiveness of our proposed method.
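To make the log-sum-exp idea from the TL;DR concrete, the sketch below compares a weighted Tchebycheff value (a max over weighted objective gaps) with a smoothed version. It is only an illustration under stated assumptions: the function names, the reference point `z`, the preference weights `lam`, and the smoothing parameter `mu` are hypothetical choices made here, and the exact formulation in the paper may differ in its details.

```python
import math


def tchebycheff(f, lam, z):
    """Non-smooth weighted Tchebycheff value: max_i lam_i * (f_i - z_i)."""
    return max(l * (fi - zi) for fi, l, zi in zip(f, lam, z))


def smooth_tchebycheff(f, lam, z, mu):
    """Log-sum-exp smoothing of the max (assumed form):
    mu * log(sum_i exp(lam_i * (f_i - z_i) / mu)), with mu > 0.
    """
    terms = [l * (fi - zi) / mu for fi, l, zi in zip(f, lam, z)]
    m = max(terms)  # subtract the max before exponentiating for stability
    return mu * (m + math.log(sum(math.exp(t - m) for t in terms)))


# Illustrative values only (not taken from the paper).
f = [0.8, 0.3]    # objective values at some candidate solution
lam = [0.5, 0.5]  # trade-off preference weights
z = [0.0, 0.0]    # reference (ideal) point

tch = tchebycheff(f, lam, z)
for mu in (1.0, 0.1, 0.01):
    stch = smooth_tchebycheff(f, lam, z, mu)
    # Standard log-sum-exp bound: max <= smooth <= max + mu * log(number of terms)
    print(f"mu={mu}: smooth={stch:.6f}, max={tch:.6f}")
```

As `mu` shrinks, the smooth value approaches the max from above, while for any `mu > 0` it stays differentiable in the objective values, which is what makes plain gradient-based optimization applicable.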
