PolyNet: Learning Diverse Solution Strategies for Neural Combinatorial Optimization

André Hottung, Mridul Mahajan, Kevin Tierney

TL;DR

PolyNet tackles the exploration bottleneck in neural combinatorial optimization by learning multiple diverse solution strategies with a single decoder. It conditions solution generation on strategy bit-vectors, training via a best-of-$K$ update to yield a portfolio of complementary policies without enforcing diversity through handcrafted first actions. Across TSP, CVRP, CVRPTW, and FFSP, PolyNet delivers consistent improvements over state-of-the-art learning methods and matches or surpasses problem-specific solvers in several settings, while enabling efficient test-time search and richer diversity. The approach broadens applicability to complex constrained CO tasks and demonstrates that implicit, learned diversity can enhance both exploration and solution quality in practice.
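The two ingredients named above, strategy bit-vectors and a best-of-$K$ update, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names, the choice of enumerating binary codes in order, and the 0/1 weighting are assumptions made for illustration only. The summary does not specify how the selected rollout enters the gradient estimator (for example, which baseline is used), so that part is left out.

```python
import numpy as np


def strategy_bit_vectors(k: int) -> np.ndarray:
    """Return k distinct binary vectors, one per strategy (hypothetical helper).

    Assumption: strategies are encoded as the binary representation of
    0..k-1 using the smallest number of bits that keeps them distinct.
    """
    n_bits = max(1, (k - 1).bit_length())
    ids = np.arange(k)
    shifts = np.arange(n_bits - 1, -1, -1)  # most significant bit first
    return ((ids[:, None] >> shifts) & 1).astype(np.float32)


def best_of_k_weights(costs: np.ndarray) -> np.ndarray:
    """Mark the best of K rollouts per instance (hypothetical helper).

    costs: array of shape (batch, k), lower is better.
    Returns a (batch, k) array with 1.0 at the lowest-cost rollout of each
    instance and 0.0 elsewhere, so only that rollout would contribute to
    a policy-gradient update.
    """
    best = costs.argmin(axis=1)
    weights = np.zeros(costs.shape, dtype=np.float32)
    weights[np.arange(costs.shape[0]), best] = 1.0
    return weights


if __name__ == "__main__":
    vectors = strategy_bit_vectors(4)               # one conditioning vector per strategy
    costs = np.array([[3.0, 1.0, 2.0, 4.0],
                      [5.0, 6.0, 4.0, 7.0]])        # toy costs for 2 instances, K = 4
    print(vectors)
    print(best_of_k_weights(costs))
```

Because only the winning rollout of each instance is reinforced, different bit-vectors are free to specialize on different instances, which is one way to read the "portfolio of complementary policies" described above.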

Abstract

Reinforcement learning-based methods for constructing solutions to combinatorial optimization problems are rapidly approaching the performance of human-designed algorithms. To further narrow the gap, learning-based approaches must efficiently explore the solution space during the search process. Recent approaches artificially increase exploration by enforcing diverse solution generation through handcrafted rules. However, these rules can impair solution quality and are difficult to design for more complex problems. In this paper, we introduce PolyNet, an approach for improving exploration of the solution space by learning complementary solution strategies. In contrast to other works, PolyNet uses only a single decoder and a training schema that does not enforce diverse solution generation through handcrafted rules. We evaluate PolyNet on four combinatorial optimization problems and observe that the implicit diversity mechanism allows PolyNet to find better solutions than approaches that explicitly enforce diverse solution generation.

Paper Structure (34 sections, 2 equations, 11 figures, 6 tables)

Figures (11)

  • Figure 1: PolyNet solution generation.
  • Figure 2: Decoder architecture.
  • Figure 3: Validation performance during training (log scale).
  • Figure 4: Contribution of different search strategies. Strategies are sorted based on their contribution.
  • Figure 5: Frequency plots for the number of unique first nodes.
  • ...and 6 more figures