Multi-objective Optimization by Learning Space Partitions
Yiyang Zhao, Linnan Wang, Kevin Yang, Tianjun Zhang, Tian Guo, Yuandong Tian
TL;DR
LaMOO tackles multi-objective optimization (MOO) by learning data-driven partitions of the search space from dominance information and using Monte Carlo Tree Search (MCTS) over those partitions to balance exploration and exploitation. The method acts as a meta-optimizer: it wraps inner MOO or single-objective solvers (e.g., qEHVI or CMA-ES) within a learned region hierarchy to focus sampling near the Pareto frontier. Theoretical analysis provides conditions under which learning partitions improves sample efficiency, and empirical results across synthetic benchmarks, NAS on NasBench201, vehicle safety design, and molecule discovery show substantial hypervolume (HV) gains and reduced sample counts. This makes the approach well suited to expensive black-box MOO problems, where it reduces the number of evaluations while maintaining Pareto coverage.
Abstract
In contrast to single-objective optimization (SOO), multi-objective optimization (MOO) requires an optimizer to find the Pareto frontier, a subset of feasible solutions that are not dominated by other feasible solutions. In this paper, we propose LaMOO, a novel multi-objective optimizer that learns a model from observed samples to partition the search space and then focuses on promising regions that are likely to contain a subset of the Pareto frontier. The partitioning is based on the dominance number, which measures "how close" a data point is to the Pareto frontier among existing samples. To account for possible partition errors due to limited samples and model mismatch, we leverage Monte Carlo Tree Search (MCTS) to exploit promising regions while exploring suboptimal regions that may turn out to contain good solutions later. Theoretically, we prove the efficacy of learning space partitioning via LaMOO under certain assumptions. Empirically, as measured by HyperVolume (HV), a popular MOO metric, LaMOO substantially outperforms strong baselines on multiple real-world MOO tasks, by up to 225% in sample efficiency for neural architecture search on NasBench201, and up to 10% for molecular design.
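To make the dominance number concrete, the following is a minimal sketch, not the authors' implementation. It assumes that all objectives are maximized and that a sample's dominance number is the count of other observed samples that dominate it, so samples with a count of zero form the current Pareto frontier estimate. The function name `dominance_numbers` and the toy data are hypothetical.

```python
import numpy as np


def dominance_numbers(F):
    """Count, for each sample, how many other samples dominate it.

    F is an (n, m) array of objective values for n samples and m objectives.
    Assumption: every objective is maximized.
    """
    F = np.asarray(F, dtype=float)
    # at_least[i, j]: sample i is at least as good as sample j in every objective
    at_least = (F[:, None, :] >= F[None, :, :]).all(axis=2)
    # strictly[i, j]: sample i is strictly better than sample j in some objective
    strictly = (F[:, None, :] > F[None, :, :]).any(axis=2)
    dominates = at_least & strictly  # dominates[i, j]: i dominates j
    # Summing over dominators i gives the dominance number of each sample j
    return dominates.sum(axis=0)


# Toy data: four samples, two objectives (both maximized)
F = np.array([[1.0, 4.0], [2.0, 3.0], [1.0, 2.0], [0.5, 0.5]])
counts = dominance_numbers(F)
pareto_mask = counts == 0  # non-dominated samples among those observed
print(counts, pareto_mask)
```

In this toy set the first two samples trade off against each other and are non-dominated, while the last two have dominance numbers 2 and 3. A partitioning model could use such counts as labels to separate promising regions from less promising ones; how LaMOO does this in detail is not specified in the abstract.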
