PARCO: Parallel AutoRegressive Models for Multi-Agent Combinatorial Optimization

Federico Berto, Chuanbo Hua, Laurin Luttmann, Jiwoo Son, Junyoung Park, Kyuree Ahn, Changhyun Kwon, Lin Xie, Jinkyoo Park

TL;DR

PARCO tackles multi-agent combinatorial optimization by enabling parallel solution construction with a transformer-based communication layer, a multiple pointer mechanism, and priority-based conflict handling. It treats multi-agent CO as a cooperative MDP and uses centralized training with REINFORCE. Across HCVRP, OMDCPDP, and FFSP, PARCO consistently outperforms state-of-the-art learning solvers and scales to large problem sizes with substantial speedups. The paper demonstrates strong generalization to unseen numbers of nodes and agents and releases open-source code for reproducibility. The work advances practical real-time optimization for logistics and scheduling with broad applicability.

Abstract

Combinatorial optimization problems involving multiple agents are notoriously challenging due to their NP-hard nature and the necessity for effective agent coordination. Despite advancements in learning-based methods, existing approaches often face critical limitations, including suboptimal agent coordination, poor generalization, and high computational latency. To address these issues, we propose PARCO (Parallel AutoRegressive Combinatorial Optimization), a general reinforcement learning framework designed to construct high-quality solutions for multi-agent combinatorial tasks efficiently. To this end, PARCO integrates three key novel components: (1) transformer-based communication layers to enable effective agent collaboration during parallel solution construction, (2) a multiple pointer mechanism for low-latency, parallel agent decision-making, and (3) priority-based conflict handlers to resolve decision conflicts via learned priorities. We evaluate PARCO in multi-agent vehicle routing and scheduling problems, where our approach outperforms state-of-the-art learning methods, demonstrating strong generalization ability and remarkable computational efficiency. We make our source code publicly available to foster future research: https://github.com/ai4co/parco.
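
To make component (3) concrete, here is a minimal sketch of priority-based conflict handling for one parallel decoding step. It is illustrative only and is not taken from the PARCO repository. The function name `resolve_conflicts`, the plain-number priorities, and the fallback rule (a losing agent takes a supplied fallback action, such as staying at its current node) are all assumptions made for this example. In the paper, priorities are learned.

```python
from typing import Dict, List, Sequence


def resolve_conflicts(
    chosen: Sequence[int],
    priority: Sequence[float],
    fallback: Sequence[int],
) -> List[int]:
    """Keep each contested node for the highest-priority agent only.

    chosen[i]   : node that agent i selected this step (one pointer per agent)
    priority[i] : priority score of agent i (hypothetical: given as a number)
    fallback[i] : action agent i takes if it loses (assumed: stay in place)
    """
    winner: Dict[int, int] = {}  # node -> index of the agent holding it
    for agent, node in enumerate(chosen):
        best = winner.get(node)
        # Strictly-greater comparison: ties go to the lower agent index.
        if best is None or priority[agent] > priority[best]:
            winner[node] = agent
    # Winners keep their node; every other agent falls back.
    return [
        node if winner[node] == agent else fallback[agent]
        for agent, node in enumerate(chosen)
    ]


# Agents 0 and 1 both select node 5; agent 1 has the higher priority.
actions = resolve_conflicts(
    chosen=[5, 5, 7],
    priority=[0.2, 0.9, 0.4],
    fallback=[0, 1, 2],
)
print(actions)  # [0, 5, 7]
```

In this sketch all agents select simultaneously and conflicts are resolved afterwards, so one step can yield up to one action per agent rather than a single action per step. The sketch does not check whether a fallback action is itself feasible; a real environment would enforce that through its action mask.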

Paper Structure (97 sections, 14 equations, 6 figures, 7 tables, 1 algorithm)

Figures (6)

  • Figure 1: PARCO generates better solutions with higher efficiency through parallel decision making.
  • Figure 2: Overview of PARCO. Our model encodes multi-agent CO problems into separate agent and node embeddings. Communication Layers allow for coordination among agents during decoding, which enhances solution quality. Actions are decoded autoregressively and in parallel through a Multiple Pointer Mechanism enhanced by a Priority-based Conflict Handler.
  • Figure 3: Analysis of PARCO components.
  • Figure 4: PARCO vs AR inference time. PARCO constructs solutions faster with more agents $M$.
  • Figure 5: Real-world instance for the OMDCPDP problem in Seoul City, South Korea, with $N=1000$ locations and $m=100$ agents, showing the relations between pickups and their respective deliveries.
  • ...and 1 more figure