Multi-Objective Bilevel Learning
Zhiyao Zhang, Zhuqing Liu, Xin Zhang, Wen-Yen Chen, Jiyan Yang, Jia Liu
TL;DR
This work addresses multi-objective bilevel learning (MOBL), where an upper-level vector objective $\Phi(x)$ depends on the optimal solution $y^*(x)$ of a lower-level problem with objective $g(x,y)$. It introduces the weighted-Chebyshev multi-hyper-gradient-descent (WC-MHGD) framework, which reaches Pareto-stationary solutions with finite-time convergence guarantees in both deterministic and stochastic settings and enables systematic Pareto-front exploration guided by preferences $r\in\Delta_S^+$. The authors establish a Preference-Guided Minimum-Norm lemma and provide explicit oracle complexities for both Neumann-series (NS) and conjugate-gradient (CG) hypergradient computations, along with stochastic analogues. Empirical results on meta-learning and data-cleaning tasks validate the theory, demonstrating preference-guided improvements and broader Pareto-front coverage than baselines. Overall, the work lays a theoretical and algorithmic foundation for MOBL with explicit preference-driven Pareto-front exploration and practical convergence guarantees.
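To make the hypergradient idea concrete, here is a minimal sketch of an implicit-differentiation hypergradient whose inverse-Hessian-vector product is approximated by a truncated Neumann series. It is not the authors' implementation: the quadratic toy problem, the matrix `A`, the target `c`, the step size `eta`, and all function names are assumptions chosen so the result can be checked by hand.

```python
# Toy bilevel problem (all names and numbers are illustrative assumptions):
#   lower level: g(x, y) = 0.5 * y^T A y - x^T y   =>  y*(x) = A^{-1} x
#   upper level: f(x, y) = 0.5 * ||y - c||^2       =>  Phi(x) = f(x, y*(x))

def matvec(A, v):
    """Dense matrix-vector product on plain Python lists."""
    return [sum(a * b for a, b in zip(row, v)) for row in A]

def neumann_inverse_hvp(A, b, eta, num_terms):
    """Approximate A^{-1} b by eta * sum_{k=0}^{K-1} (I - eta * A)^k b.

    The series converges when A is symmetric positive definite and
    0 < eta < 2 / lambda_max(A).
    """
    term = list(b)                 # holds (I - eta * A)^k b
    total = [0.0] * len(b)
    for _ in range(num_terms):
        total = [t + u for t, u in zip(total, term)]
        a_term = matvec(A, term)
        term = [u - eta * au for u, au in zip(term, a_term)]
    return [eta * t for t in total]

def hypergradient(A, c, x, eta=0.4, num_terms=100):
    """Gradient of Phi(x) = f(x, y*(x)) for the toy problem above."""
    # Approximate the lower-level solution y*(x) = A^{-1} x.
    y_star = neumann_inverse_hvp(A, x, eta, num_terms)
    grad_y_f = [yi - ci for yi, ci in zip(y_star, c)]
    # Implicit function theorem:
    #   grad Phi = grad_x f - (cross term) [hess_yy g]^{-1} grad_y f
    # Here grad_x f = 0, the cross term is -I, and hess_yy g = A,
    # so the hypergradient reduces to A^{-1} grad_y f.
    return neumann_inverse_hvp(A, grad_y_f, eta, num_terms)

A = [[2.0, 0.5], [0.5, 1.0]]   # symmetric positive definite
c = [1.0, 0.0]
x = [1.0, 2.0]
hg = hypergradient(A, c, x)
```

A conjugate-gradient solver could replace `neumann_inverse_hvp` without changing the rest of the sketch, since both only need products with the lower-level Hessian.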
Abstract
As machine learning (ML) applications have grown increasingly complex in recent years, modern ML frameworks often need to address multiple, potentially conflicting objectives with coupled decision variables across different layers. This creates a compelling need for multi-objective bilevel learning (MOBL). So far, however, the field of MOBL remains in its infancy and many important problems remain under-explored. This motivates us to fill this gap and systematically investigate the theoretical and algorithmic foundations of MOBL. Specifically, we consider MOBL problems with multiple conflicting objectives guided by preferences at the upper-level subproblem, where part of the inputs depend on the optimal solution of the lower-level subproblem. Our goal is to develop efficient MOBL optimization algorithms to (1) identify a preference-guided Pareto-stationary solution with low oracle complexity; and (2) enable systematic Pareto front exploration. To this end, we propose a unifying algorithmic framework called weighted-Chebyshev multi-hyper-gradient-descent (WC-MHGD) for both deterministic and stochastic settings with finite-time Pareto-stationarity convergence rate guarantees, which not only imply low oracle complexity but also induce systematic Pareto front exploration. We further conduct extensive experiments to confirm our theoretical results.
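The two generic ingredients named above, weighted-Chebyshev scalarization and a minimum-norm combination of gradients, can be sketched for the two-objective case as follows. This illustrates the standard textbook forms only; how WC-MHGD actually combines them is not stated in this summary, and the function names and sample vectors are hypothetical.

```python
def weighted_chebyshev(values, prefs):
    """Weighted-Chebyshev scalarization: max_s r_s * f_s.

    `values` are objective values f_s and `prefs` are preference
    weights r_s, assumed positive and summing to one.
    """
    return max(r * f for r, f in zip(prefs, values))

def min_norm_two(g1, g2):
    """Minimum-norm point of the segment between two gradients.

    Solves min over lam in [0, 1] of ||lam * g1 + (1 - lam) * g2||^2
    in closed form and returns (lam, d) with d the combined direction.
    A zero d indicates Pareto stationarity for the two objectives.
    """
    diff = [a - b for a, b in zip(g1, g2)]
    denom = sum(d * d for d in diff)
    if denom == 0.0:
        # Identical gradients: every lam gives the same direction.
        return 0.5, list(g1)
    lam = sum((b - a) * b for a, b in zip(g1, g2)) / denom
    lam = min(1.0, max(0.0, lam))  # project onto [0, 1]
    d = [lam * a + (1.0 - lam) * b for a, b in zip(g1, g2)]
    return lam, d

# Hypothetical sample values for illustration.
scalar = weighted_chebyshev([2.0, 1.0], [0.25, 0.75])
lam_orth, d_orth = min_norm_two([1.0, 0.0], [0.0, 1.0])
lam_opp, d_opp = min_norm_two([1.0, 0.0], [-1.0, 0.0])
```

Sweeping the preference vector over the simplex and re-solving is one common way to trace different points on a Pareto front; in the bilevel setting the gradients `g1` and `g2` would be hypergradients of the upper-level objectives.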
