Pareto-NRPA: A Novel Monte-Carlo Search Algorithm for Multi-Objective Optimization
Noé Lallouet, Tristan Cazenave, Cyrille Enderli
TL;DR
Pareto-NRPA generalizes the Nested Rollout Policy Adaptation framework to multi-objective optimization over discrete spaces by employing a set of policies that explore different regions and by updating policies via non-dominated fronts using crowding-distance weighting. The approach is validated on a new MO-TSPTW dataset and NAS benchmarks, showing competitive or superior hypervolume performance relative to state-of-the-art MOEAs, especially in constrained search spaces. The work demonstrates the feasibility of Monte-Carlo search with Pareto-front guidance for multi-objective, discrete problems and outlines several avenues for enhancement, including continuous-domain extension, higher-dimensional objectives, and efficiency improvements. Overall, Pareto-NRPA emerges as a strong candidate for constraint-aware, sequential MOO in discrete domains, with potential impact on routing, architecture search, and related optimization tasks.
Abstract
We introduce Pareto-NRPA, a new Monte-Carlo algorithm designed for multi-objective optimization problems over discrete search spaces. Extending the Nested Rollout Policy Adaptation (NRPA) algorithm originally formulated for single-objective problems, Pareto-NRPA generalizes the nested search and policy update mechanism to multi-objective optimization. The algorithm uses a set of policies to concurrently explore different regions of the solution space and maintains non-dominated fronts at each level of the search. Policy adaptation is performed with respect to the diversity and isolation of sequences within the Pareto front. We benchmark Pareto-NRPA on two classes of problems: a novel bi-objective variant of the Traveling Salesman Problem with Time Windows (MO-TSPTW), and a neural architecture search task on well-known benchmarks. Results demonstrate that Pareto-NRPA achieves competitive performance against state-of-the-art multi-objective algorithms, both in terms of convergence and diversity of solutions. In particular, Pareto-NRPA strongly outperforms state-of-the-art evolutionary multi-objective algorithms on constrained search spaces. To our knowledge, this work constitutes the first adaptation of NRPA to the multi-objective setting.
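To make the notions of a non-dominated front and of "isolation" within it concrete, the following is a minimal Python sketch. It assumes all objectives are minimized and uses the standard NSGA-II style crowding distance as the isolation measure. The function names (`dominates`, `non_dominated`, `crowding_distance`) are illustrative and not taken from the paper, and the exact way Pareto-NRPA turns these distances into policy-update weights is not specified here.

```python
from typing import List, Sequence

Point = Sequence[float]


def dominates(a: Point, b: Point) -> bool:
    """True if a is no worse than b on every objective and strictly
    better on at least one (assumption: all objectives are minimized)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def non_dominated(points: List[Point]) -> List[Point]:
    """Keep only the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


def crowding_distance(front: List[Point]) -> List[float]:
    """NSGA-II style crowding distance for each point of a front.
    Larger values mean the point lies in a less crowded region;
    the extreme points of each objective receive infinity."""
    n = len(front)
    if n == 0:
        return []
    dist = [0.0] * n
    for k in range(len(front[0])):
        # Indices of the front sorted by objective k.
        order = sorted(range(n), key=lambda i: front[i][k])
        lo, hi = front[order[0]][k], front[order[-1]][k]
        dist[order[0]] = dist[order[-1]] = float("inf")
        if hi == lo:
            continue  # all points share this objective value
        for j in range(1, n - 1):
            gap = front[order[j + 1]][k] - front[order[j - 1]][k]
            dist[order[j]] += gap / (hi - lo)
    return dist


# Hypothetical bi-objective scores for five candidate sequences.
scores = [(1, 5), (2, 3), (4, 1), (3, 4), (5, 5)]
front = non_dominated(scores)
distances = crowding_distance(front)
```

In this toy example, `(3, 4)` and `(5, 5)` are dominated and dropped, and the two extreme points of the remaining front receive an infinite distance. A policy update that favors isolated sequences could, under these assumptions, weight each front member by a bounded function of its distance.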
