PolyNet: Learning Diverse Solution Strategies for Neural Combinatorial Optimization
André Hottung, Mridul Mahajan, Kevin Tierney
TL;DR
PolyNet tackles the exploration bottleneck in neural combinatorial optimization by learning multiple diverse solution strategies with a single decoder. It conditions solution generation on strategy bit-vectors, training via a best-of-$K$ update to yield a portfolio of complementary policies without enforcing diversity through handcrafted first actions. Across TSP, CVRP, CVRPTW, and FFSP, PolyNet delivers consistent improvements over state-of-the-art learning methods and matches or surpasses problem-specific solvers in several settings, while enabling efficient test-time search and richer diversity. The approach broadens applicability to complex constrained CO tasks and demonstrates that implicit, learned diversity can enhance both exploration and solution quality in practice.
Abstract
Reinforcement learning-based methods for constructing solutions to combinatorial optimization problems are rapidly approaching the performance of human-designed algorithms. To further narrow the gap, learning-based approaches must efficiently explore the solution space during the search process. Recent approaches artificially increase exploration by enforcing diverse solution generation through handcrafted rules. However, these rules can impair solution quality and are difficult to design for more complex problems. In this paper, we introduce PolyNet, an approach for improving exploration of the solution space by learning complementary solution strategies. In contrast to other works, PolyNet uses only a single decoder and a training schema that does not enforce diverse solution generation through handcrafted rules. We evaluate PolyNet on four combinatorial optimization problems and observe that the implicit diversity mechanism allows PolyNet to find better solutions than approaches that explicitly enforce diverse solution generation.
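To make the best-of-$K$ update mentioned in the TL;DR more concrete, the following is a minimal sketch, not the authors' implementation. It assumes a minimization problem, one rollout per strategy bit-vector, and a baseline equal to the mean cost of the $K$ rollouts of an instance (the baseline choice is an assumption here). The function names `strategy_vectors` and `best_of_k_weights` are hypothetical and chosen for illustration only.

```python
import itertools

import numpy as np


def strategy_vectors(k, n_bits):
    """Return k distinct binary vectors of length n_bits (hypothetical helper).

    Each vector would be fed to the decoder as a conditioning input,
    so that one decoder can represent up to 2**n_bits strategies.
    """
    if k > 2 ** n_bits:
        raise ValueError("k must not exceed 2**n_bits")
    all_vectors = itertools.product((0, 1), repeat=n_bits)
    return np.array(list(itertools.islice(all_vectors, k)), dtype=np.float32)


def best_of_k_weights(costs):
    """Per-rollout weights for a best-of-K policy-gradient loss (sketch).

    costs: array of shape (batch, K), one cost per instance and strategy.
    Only the best (lowest-cost) rollout of each instance gets a nonzero
    weight, equal to its cost minus the assumed baseline (mean over K).
    """
    costs = np.asarray(costs, dtype=np.float64)
    rows = np.arange(costs.shape[0])
    best = costs.argmin(axis=1)        # index of the best rollout per instance
    baseline = costs.mean(axis=1)      # assumed baseline: mean cost over K rollouts
    weights = np.zeros_like(costs)
    weights[rows, best] = costs[rows, best] - baseline
    return weights


# Example: one instance, K = 3 rollouts with costs 3, 1 and 2.
vectors = strategy_vectors(4, 2)
weights = best_of_k_weights([[3.0, 1.0, 2.0]])
```

In a training loop, a surrogate loss of the form `(weights * log_probs).sum(axis=1).mean()` would then update only the strategy that produced the best solution for each instance, where `log_probs` (also hypothetical) holds the log-likelihood of each rollout under its bit-vector. Because the other $K-1$ rollouts receive zero weight, nothing forces every strategy to perform well on every instance, which is what allows strategies to specialize without handcrafted diversity rules.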
