Learning Pareto manifolds in high dimensions: How can regularization help?
Tobias Wegel, Filip Kovačević, Alexandru Ţifrea, Fanny Yang
TL;DR
This work addresses learning Pareto fronts in high-dimensional multi-objective learning (MOL) when labeled data is limited. It shows that naively regularizing the scalarized objectives directly can be insufficient, and introduces a two-stage estimator that first learns distributional parameters (potentially using unlabeled data) and then optimizes a scalarized MOL objective to recover the Pareto set. The authors establish upper bounds that propagate parameter estimation errors to errors in the estimated Pareto points and prove minimax lower bounds, showing that the two-stage method is minimax-optimal under Lipschitz identifiability. The approach is illustrated on examples such as multi-distribution sparse regression and fairness-risk trade-offs, and validated in experiments that use ensembles and hypernetworks to approximate the Pareto set. Overall, the paper provides a theory-backed framework for efficient Pareto learning in high dimensions, with implications for robust, fair, and multi-objective ML systems.
Abstract
Simultaneously addressing multiple objectives is becoming increasingly important in modern machine learning. At the same time, data is often high-dimensional and costly to label. For a single objective such as prediction risk, conventional regularization techniques are known to improve generalization when the data exhibits low-dimensional structure like sparsity. However, it is largely unexplored how to leverage this structure in the context of multi-objective learning (MOL) with multiple competing objectives. In this work, we discuss how the application of vanilla regularization approaches can fail, and propose a two-stage MOL framework that can successfully leverage low-dimensional structure. We demonstrate its effectiveness experimentally for multi-distribution learning and fairness-risk trade-offs.
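To make the two-stage idea concrete, the following is a minimal sketch for one illustrative setting: two-distribution sparse linear regression with squared loss and linear scalarization. It is not the authors' implementation. The function names (`lasso_ista`, `two_stage_pareto_point`), the synthetic data, the regularization strength, and the choice of an ISTA-based Lasso are all assumptions made for illustration. Stage 1 estimates each distribution's sparse parameter from few labeled samples and each covariance from many unlabeled samples; stage 2 plugs these estimates into the scalarized population risk, which in this setting has a closed-form minimizer.

```python
import numpy as np

def soft_threshold(v, t):
    """Proximal operator of t * ||.||_1 (elementwise shrinkage)."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def lasso_ista(X, y, reg, n_iter=500):
    """Stage 1 (assumed choice): Lasso via proximal gradient descent.

    Minimizes (1 / 2n) * ||y - X theta||^2 + reg * ||theta||_1.
    """
    n, d = X.shape
    # Step size = 1 / Lipschitz constant of the smooth part's gradient.
    step = n / (np.linalg.norm(X, 2) ** 2)
    theta = np.zeros(d)
    for _ in range(n_iter):
        grad = X.T @ (X @ theta - y) / n
        theta = soft_threshold(theta - step * grad, step * reg)
    return theta

def two_stage_pareto_point(thetas_hat, sigmas_hat, weights):
    """Stage 2: minimize the plug-in scalarized risk.

    With squared loss, the risk under distribution k is, up to a constant,
    (theta - theta_k)^T Sigma_k (theta - theta_k). The weighted sum of these
    quadratics is minimized by solving
    (sum_k w_k Sigma_k) theta = sum_k w_k Sigma_k theta_k.
    """
    A = sum(w * S for w, S in zip(weights, sigmas_hat))
    b = sum(w * (S @ t) for w, S, t in zip(weights, sigmas_hat, thetas_hat))
    return np.linalg.solve(A, b)

# Hypothetical synthetic setup: d > n_labeled, plenty of unlabeled data.
rng = np.random.default_rng(0)
d, n_labeled, n_unlabeled, sparsity = 50, 30, 2000, 3

thetas_hat, sigmas_hat = [], []
for k in range(2):
    theta_true = np.zeros(d)
    theta_true[k * sparsity:(k + 1) * sparsity] = 1.0  # disjoint supports
    X_lab = rng.standard_normal((n_labeled, d))
    y_lab = X_lab @ theta_true + 0.1 * rng.standard_normal(n_labeled)
    X_unl = rng.standard_normal((n_unlabeled, d))

    thetas_hat.append(lasso_ista(X_lab, y_lab, reg=0.1))  # labeled data
    sigmas_hat.append(X_unl.T @ X_unl / n_unlabeled)      # unlabeled data

# Sweep the scalarization weight to trace an estimate of the Pareto set.
pareto_points = [
    two_stage_pareto_point(thetas_hat, sigmas_hat, [lam, 1.0 - lam])
    for lam in np.linspace(0.0, 1.0, 5)
]
```

In this sketch the regularization acts on the per-distribution parameters in stage 1 rather than on the scalarized objective itself. A point on the Pareto set is in general a covariance-weighted combination of the individual parameters, so it need not be sparse even when each individual parameter is.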
