Efficient First-Order Optimization on the Pareto Set for Multi-Objective Learning under Preference Guidance
Lisha Chen, Quan Xiao, Ellen Hidemi Fukuda, Xinyi Chen, Kun Yuan, Tianyi Chen
TL;DR
This work tackles preference-guided multi-objective learning (MOL) by formulating it as Optimization on the Pareto Set (OPS), a semivectorial bilevel problem. It introduces a smoothed merit function $v_{l,\tau}$ that scalarizes the vector-valued objectives, and a penalty-based reformulation whose solutions are related to those of the original constrained OPS problem. A practical first-order algorithm, FOOPS, alternates between updating the Pareto-set variable $y$ and the decision variable $x$, with convergence guarantees under Hölderian error bounds and Kurdyka-Łojasiewicz (KL) inequalities. Theoretical contributions include properties of the merit function and its relation to weak Pareto optimality. Experiments on synthetic and real-world MOL tasks show that the method finds preference-guided Pareto solutions competitively. The framework offers a Hessian-free alternative to prior bilevel optimization (BLO) methods for problems that require controlled trade-offs among objectives.
Abstract
Multi-objective learning under a user-specified preference is common in real-world problems such as multi-lingual speech recognition with fairness considerations. In this work, we frame such a problem as a semivectorial bilevel optimization problem, whose goal is to optimize a pre-defined preference function subject to the constraint that the model parameters are weakly Pareto optimal. To solve this problem, we convert the multi-objective constraints into a single-objective constraint through a merit function with an easy-to-evaluate gradient, and then use a penalty-based reformulation of the bilevel optimization problem. We theoretically establish the properties of the merit function, and the relations between solutions of the penalty reformulation and of the constrained formulation. We then propose algorithms to solve the reformulated single-level problem and establish their convergence guarantees. We test the method on various synthetic and real-world problems. The results demonstrate the effectiveness of the proposed method in finding preference-guided optimal solutions to the multi-objective problem.
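As a rough sketch of the formulation described in the abstract, the constrained problem and its penalty reformulation might be written as follows. The notation is assumed here for illustration and is not taken from the paper: $f_0$ denotes the preference function, $F = (f_1, \dots, f_M)$ the objectives, $\mathrm{WP}(F)$ the weakly Pareto optimal set, $v$ a generic merit function, and $\gamma > 0$ a penalty weight.

$$
\min_{x} \; f_0(x) \quad \text{s.t.} \quad x \in \mathrm{WP}(F) := \{\, x : \nexists\, x' \ \text{with} \ f_m(x') < f_m(x) \ \text{for all} \ m = 1, \dots, M \,\}
$$

$$
\min_{x} \; f_0(x) + \gamma\, v(x), \qquad v(x) \ge 0, \qquad v(x) = 0 \iff x \in \mathrm{WP}(F)
$$

The first line is the constrained problem; the second replaces the set constraint with a penalty on the merit function. The characterization $v(x) = 0 \iff x \in \mathrm{WP}(F)$ is the property one would typically expect of a merit function; the exact conditions under which it holds, and the precise form of the smoothed merit function, are given in the paper itself.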
