Multi-Objective Bilevel Learning

Zhiyao Zhang, Zhuqing Liu, Xin Zhang, Wen-Yen Chen, Jiyan Yang, Jia Liu

TL;DR

This work addresses multi-objective bilevel learning (MOBL), where an upper-level vector objective $\Phi(x)$ depends on the optimal solution $y^*(x)$ of a lower-level problem with objective $g(x,y)$. It introduces the weighted-Chebyshev multi-hyper-gradient-descent (WC-MHGD) framework, which reaches Pareto-stationary solutions with finite-time convergence guarantees in both deterministic and stochastic settings and supports systematic Pareto-front exploration guided by preference vectors $r\in\Delta_S^+$. The authors establish a Preference-Guided Minimum-Norm lemma and derive explicit oracle complexities for hypergradient computation via Neumann series (NS) and conjugate gradient (CG), along with stochastic analogues. Experiments on meta-learning and data-cleaning tasks support the theory, showing preference-guided improvements and broader Pareto-front coverage than the baselines.
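To make the hypergradient computation mentioned above concrete, here is a minimal sketch for a single upper-level objective. Everything in it is an illustrative assumption rather than the paper's setup: the quadratic lower-level objective $g(x,y)=\tfrac12 y^\top A y - x^\top y$, the upper-level loss $f(x,y)=\tfrac12\|y-t\|^2$, and the helper names `neumann_solve` and `cg_solve`. For this toy problem $y^*(x)=A^{-1}x$ and the implicit-function-theorem hypergradient reduces to $A^{-1}(y^*(x)-t)$, so both approximations can be checked against a direct solve.

```python
import numpy as np

def neumann_solve(A, v, eta, num_terms):
    """Approximate A^{-1} v by the truncated series eta * sum_{k<K} (I - eta*A)^k v."""
    term = v.copy()
    total = np.zeros_like(v)
    for _ in range(num_terms):
        total += term
        term = term - eta * (A @ term)  # next power of (I - eta*A) applied to v
    return eta * total

def cg_solve(A, v, num_iters):
    """Approximate A^{-1} v with conjugate gradient (A symmetric positive definite)."""
    z = np.zeros_like(v)
    res = v - A @ z
    direction = res.copy()
    rs_old = res @ res
    for _ in range(num_iters):
        if rs_old < 1e-30:  # residual is already negligible
            break
        A_dir = A @ direction
        alpha = rs_old / (direction @ A_dir)
        z += alpha * direction
        res -= alpha * A_dir
        rs_new = res @ res
        direction = res + (rs_new / rs_old) * direction
        rs_old = rs_new
    return z

# Hypothetical toy problem (not from the paper).
A = np.array([[2.0, 0.5], [0.5, 1.0]])  # lower-level Hessian, symmetric positive definite
x = np.array([1.0, -1.0])               # upper-level variable
t = np.array([0.5, 0.5])                # upper-level target

y_star = np.linalg.solve(A, x)          # lower-level optimum y*(x) = A^{-1} x
grad_y_f = y_star - t                   # gradient of f with respect to y at y*(x)

hypergrad_exact = np.linalg.solve(A, grad_y_f)
hypergrad_ns = neumann_solve(A, grad_y_f, eta=0.4, num_terms=200)
hypergrad_cg = cg_solve(A, grad_y_f, num_iters=10)
```

The step size `eta` must keep the spectral radius of $I-\eta A$ below one for the series to converge; CG needs only Hessian-vector products and terminates in at most as many iterations as the dimension in exact arithmetic.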

Abstract

As machine learning (ML) applications have grown increasingly complex in recent years, modern ML frameworks often need to address multiple, potentially conflicting objectives with coupled decision variables across different layers. This creates a compelling need for multi-objective bilevel learning (MOBL). So far, however, the field of MOBL remains in its infancy and many important problems remain under-explored. This motivates us to fill this gap and systematically investigate the theoretical and algorithmic foundations of MOBL. Specifically, we consider MOBL problems with multiple conflicting objectives guided by preferences at the upper-level subproblem, where part of the inputs depend on the optimal solution of the lower-level subproblem. Our goal is to develop efficient MOBL optimization algorithms that (1) identify a preference-guided Pareto-stationary solution with low oracle complexity; and (2) enable systematic Pareto front exploration. To this end, we propose a unifying algorithmic framework called weighted-Chebyshev multi-hyper-gradient-descent (WC-MHGD) for both deterministic and stochastic settings with finite-time Pareto-stationarity convergence rate guarantees, which not only imply low oracle complexity but also induce systematic Pareto front exploration. We further conduct extensive experiments to confirm our theoretical results.

Paper Structure

This paper contains 22 sections, 12 theorems, 76 equations, 12 figures, 2 tables, and 4 algorithms.

Key Result

Lemma 4.1

A solution $x^*$ is weakly Pareto-optimal to an MOO problem $\min_{x} \Phi(x)$ if and only if $x^* \in \arg\min_{x} \mathsf{WC}_{r}(\Phi(x))$ for some $r \in \Delta_S^+$.
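As a small illustration of the lemma, the sketch below minimizes a weighted-Chebyshev scalarization over a grid for two toy objectives. The form $\mathsf{WC}_r(\Phi(x)) = \max_s r_s \Phi_s(x)$ is the common textbook definition with the reference point at the origin and is an assumption here, as are the objectives $\Phi_1(x)=x^2$, $\Phi_2(x)=(x-1)^2$ and the helper names; the paper's exact definition may differ.

```python
import numpy as np

def weighted_chebyshev(values, r):
    """Assumed scalarization: max over objectives of r_s * Phi_s (last axis)."""
    return np.max(np.asarray(r) * np.asarray(values), axis=-1)

def phi(x):
    """Hypothetical two-objective problem: Phi_1 = x^2, Phi_2 = (x - 1)^2."""
    x = np.asarray(x, dtype=float)
    return np.stack([x**2, (x - 1.0)**2], axis=-1)

def wc_minimizer(r, grid):
    """Grid point with the smallest weighted-Chebyshev value for preference r."""
    scores = weighted_chebyshev(phi(grid), r)
    return grid[np.argmin(scores)]

grid = np.linspace(-1.0, 2.0, 3001)

# Equal preferences balance the two objectives.
x_balanced = wc_minimizer([0.5, 0.5], grid)

# A larger weight on Phi_1 pulls the minimizer toward Phi_1's optimum at x = 0.
x_skewed = wc_minimizer([0.8, 0.2], grid)
```

For this toy problem the Pareto set is the interval $[0,1]$, and each preference vector selects one point in it: the minimizer satisfies $r_1 x^2 = r_2 (1-x)^2$, which gives $x = 1/2$ for equal weights and $x = 1/3$ for $r = (0.8, 0.2)$. Sweeping $r$ over the simplex traces out the front.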

Figures (12)

  • Figure 1: Pareto front exploration of the deterministic WC-MHGD algorithm.
  • Figure 2: Comparison between algorithms.
  • Figure 3: Geometric interpretation of the weighted-Chebyshev (WC) formulation.
  • Figure 4: Performance of the deterministic WC-MHGD algorithm.
  • Figure 5: The accuracy with different preference vector $r$.
  • ...and 7 more figures

Theorems & Definitions (23)

  • Definition 3.1: Pareto Optimality
  • Definition 3.2: Pareto Stationarity
  • Definition 3.3: $\epsilon$-Pareto Stationarity
  • Lemma 4.1: Pareto Optimality Equivalence
  • Definition 4.2: Oracle Complexity of MOBL Algorithms
  • Lemma 4.3: Pareto Stationarity
  • Lemma 5.4: Preference-Guided Minimum-Norm Lemma
  • Theorem 5.5: Finite-Time Convergence Rate of Deterministic $\mathsf{WC}$-$\mathsf{MHGD}$
  • Corollary 5.6: Oracle Complexity of Deterministic $\mathsf{WC}$-$\mathsf{MHGD}$
  • Theorem 5.7: Finite-Time Convergence Rate of Stochastic $\mathsf{WC}$-$\mathsf{MHGD}$
  • ...and 13 more
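
The Pareto-stationarity definitions listed above are not reproduced on this page. As a hedged illustration, the sketch below uses the standard notion from the multi-objective optimization literature: a point is Pareto-stationary when some convex combination of the objective gradients vanishes, and the squared norm of the minimum-norm combination serves as a stationarity gap. The two-gradient closed form and the function names are assumptions for illustration, not the paper's Preference-Guided Minimum-Norm lemma.

```python
import numpy as np

def min_norm_weight(g1, g2):
    """Weight lam in [0, 1] minimizing ||lam * g1 + (1 - lam) * g2||^2."""
    diff = g1 - g2
    denom = diff @ diff
    if denom == 0.0:  # identical gradients: every weight gives the same norm
        return 0.5
    lam = ((g2 - g1) @ g2) / denom
    return float(np.clip(lam, 0.0, 1.0))

def stationarity_gap(g1, g2):
    """Squared norm of the minimum-norm convex combination of two gradients."""
    lam = min_norm_weight(g1, g2)
    combo = lam * g1 + (1.0 - lam) * g2
    return float(combo @ combo)

# Hypothetical objectives: Phi_1 = ||x - a||^2 and Phi_2 = ||x - b||^2.
a = np.array([0.0, 0.0])
b = np.array([1.0, 0.0])

def grads(x):
    return 2.0 * (x - a), 2.0 * (x - b)

# On the segment between a and b the gradients oppose each other, so the gap is zero.
gap_on_segment = stationarity_gap(*grads(np.array([0.25, 0.0])))

# Off the segment both gradients share a component, so the gap is positive.
gap_off_segment = stationarity_gap(*grads(np.array([0.25, 1.0])))
```

Under this notion, an $\epsilon$-Pareto-stationary point would be one whose gap falls below $\epsilon$; with more than two objectives the minimum-norm weights come from a small quadratic program over the simplex instead of a closed form.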