Pareto-NRPA: A Novel Monte-Carlo Search Algorithm for Multi-Objective Optimization

Noé Lallouet, Tristan Cazenave, Cyrille Enderli

TL;DR

Pareto-NRPA generalizes the Nested Rollout Policy Adaptation framework to multi-objective optimization over discrete spaces by employing a set of policies that explore different regions and by updating policies via non-dominated fronts using crowding-distance weighting. The approach is validated on a new MO-TSPTW dataset and NAS benchmarks, showing competitive or superior hypervolume performance relative to state-of-the-art MOEAs, especially in constrained search spaces. The work demonstrates the feasibility of Monte-Carlo search with Pareto-front guidance for multi-objective, discrete problems and outlines several avenues for enhancement, including continuous-domain extension, higher-dimensional objectives, and efficiency improvements. Overall, Pareto-NRPA emerges as a strong candidate for constraint-aware, sequential MOO in discrete domains, with potential impact on routing, architecture search, and related optimization tasks.
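The hypervolume indicator mentioned above measures the region of objective space dominated by a front, bounded by a reference point. The following is a minimal sketch for the bi-objective case, assuming both objectives are minimized. The function name `hypervolume_2d`, the sample front, and the reference point are illustrative assumptions, not values or code from the paper.

```python
def hypervolume_2d(front, ref):
    """Area dominated by `front` and bounded by `ref` (two objectives, minimization).

    Illustrative helper: points that do not strictly improve on `ref`
    in both objectives are ignored.
    """
    # Sort by the first objective so the sweep moves left to right.
    pts = sorted(p for p in front if p[0] < ref[0] and p[1] < ref[1])
    hv = 0.0
    best_f2 = ref[1]  # lowest second-objective value seen so far
    for f1, f2 in pts:
        if f2 < best_f2:
            # Add the new horizontal slab contributed by this point.
            hv += (ref[0] - f1) * (best_f2 - f2)
            best_f2 = f2
    return hv


# Hypothetical front and reference point, chosen only for illustration.
example_front = [(1, 5), (2, 3), (4, 1)]
example_hv = hypervolume_2d(example_front, ref=(6, 6))
```

A larger value indicates a front that is closer to the ideal point, more spread out, or both, which is why the indicator is commonly used to compare multi-objective algorithms.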

Abstract

We introduce Pareto-NRPA, a new Monte-Carlo algorithm designed for multi-objective optimization problems over discrete search spaces. Extending the Nested Rollout Policy Adaptation (NRPA) algorithm originally formulated for single-objective problems, Pareto-NRPA generalizes the nested search and policy update mechanism to multi-objective optimization. The algorithm uses a set of policies to concurrently explore different regions of the solution space and maintains non-dominated fronts at each level of search. Policy adaptation is performed with respect to the diversity and isolation of sequences within the Pareto front. We benchmark Pareto-NRPA on two classes of problems: a novel bi-objective variant of the Traveling Salesman Problem with Time Windows (MO-TSPTW), and a neural architecture search task on well-known benchmarks. Results demonstrate that Pareto-NRPA achieves competitive performance against state-of-the-art multi-objective algorithms, both in terms of convergence and diversity of solutions. In particular, Pareto-NRPA strongly outperforms state-of-the-art evolutionary multi-objective algorithms on constrained search spaces. To our knowledge, this work constitutes the first adaptation of NRPA to the multi-objective setting.
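The two building blocks named in the abstract, a non-dominated front and a measure of how isolated each solution is within it, can be sketched as follows. This is a minimal illustration assuming minimization of every objective and the crowding distance as defined in NSGA-II. The exact way Pareto-NRPA turns these distances into policy-update weights is not given in this section, so the sketch stops at the distances. The function names and sample points are hypothetical.

```python
from math import inf


def dominates(a, b):
    """True if `a` is no worse than `b` everywhere and strictly better somewhere."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def non_dominated(points):
    """Keep the points that no other point dominates (minimization)."""
    return [p for p in points if not any(dominates(q, p) for q in points if q != p)]


def crowding_distance(front):
    """NSGA-II style crowding distance; boundary points get infinite distance."""
    n = len(front)
    dist = [0.0] * n
    if n == 0:
        return dist
    for m in range(len(front[0])):
        # Indices of the front sorted along objective m.
        order = sorted(range(n), key=lambda i: front[i][m])
        lo, hi = front[order[0]][m], front[order[-1]][m]
        dist[order[0]] = dist[order[-1]] = inf
        if hi == lo:
            continue  # all values equal: no spread along this objective
        for k in range(1, n - 1):
            gap = front[order[k + 1]][m] - front[order[k - 1]][m]
            dist[order[k]] += gap / (hi - lo)
    return dist


# Hypothetical bi-objective scores, chosen only for illustration.
scores = [(1, 5), (2, 3), (4, 1), (3, 4), (5, 5)]
front = non_dominated(scores)
isolation = crowding_distance(front)
```

Solutions with a larger crowding distance sit in sparser parts of the front, so weighting them more heavily is one plausible way to push search toward under-explored trade-offs.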

Paper Structure

This paper contains 12 sections, 4 equations, 8 figures, 19 tables, 3 algorithms.

Figures (8)

  • Figure 1: The two independent cost matrices for instance rc_204.2 of the TSPTW
  • Figure 2: Different values of $|\Pi|$ and $\alpha$ on rc_205.2
  • Figure 3: Policy distribution for two different runs on rc_204.3. Each color represents the policy from which the solution has been sampled.
  • Figure 4: Aggregated Pareto fronts on NAS-Bench-201 and NAS-Bench-101
  • Figure 5: Impact of bias on EMOA convergence on rc_205.3
  • ...and 3 more figures