TOP-Former: A Multi-Agent Transformer Approach for the Team Orienteering Problem
Daniel Fuertes, Carlos R. del-Blanco, Fernando Jaureguizar, Narciso García
TL;DR
This paper tackles the Team Orienteering Problem (TOP) by introducing TOP-Former, a centralized Transformer-based neural network that encodes the entire graph and the full fleet state to generate cooperative routes for multiple agents under a time limit $T$. Trained with deep reinforcement learning, TOP-Former employs an encoder-decoder architecture that computes a global graph embedding and autoregressively predicts multi-agent paths, ensuring feasibility via masking and a node-blocking strategy. Empirical results on synthetic TOP instances and the VDRPMDPC dataset show that TOP-Former delivers high-quality solutions with substantially faster inference times than state-of-the-art linear programming, heuristic, and neural approaches, supporting real-time decision-making in Intelligent Transportation Systems (ITS) and package delivery. The work demonstrates the value of global context and centralized attention for multi-agent Vehicle Routing Problems (VRPs), while acknowledging scalability challenges and outlining future directions toward decentralized or memory-efficient architectures for very large-scale problems.
Abstract
Route planning for a fleet of vehicles is an important task in applications such as package delivery, surveillance, and transportation, and is often integrated within larger Intelligent Transportation Systems (ITS). This problem is commonly formulated as a Vehicle Routing Problem (VRP) variant known as the Team Orienteering Problem (TOP). Existing solvers for this problem primarily rely on either linear programming, which provides accurate solutions but requires computation times that grow with the size of the problem, or heuristic methods, which typically find suboptimal solutions in a shorter time. In this paper, we introduce TOP-Former, a multi-agent route planning neural network designed to solve the Team Orienteering Problem efficiently and accurately. The proposed algorithm is based on a centralized Transformer neural network that learns to encode the scenario (modeled as a graph) and to analyze the complete context of all agents in order to deliver fast, precise, and collaborative solutions. Unlike other neural network-based approaches that adopt a more local perspective, TOP-Former is trained to understand the global situation of the vehicle fleet and generate solutions that maximize long-term expected returns. Extensive experiments demonstrate that the presented system outperforms most state-of-the-art methods in terms of both accuracy and computation speed.
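
The feasibility masking and node blocking mentioned in the TL;DR can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `feasible_nodes`, the Euclidean travel-time model, the single shared depot at index 0, and the unit `speed` are all assumptions made for illustration. The idea is that an agent may only move to a node that no agent has claimed yet and from which it can still return to the depot within its remaining time.

```python
import math

def feasible_nodes(coords, current, time_left, blocked, depot=0, speed=1.0):
    """Return the indices of nodes an agent may still visit.

    Hypothetical helper (not from the paper). Assumes Euclidean travel
    times and that every route must end at the depot.
    """
    feasible = []
    for node, xy in enumerate(coords):
        # Node blocking: skip the depot and nodes already claimed by any agent.
        if node == depot or node in blocked:
            continue
        go = math.dist(coords[current], xy) / speed    # current -> node
        back = math.dist(xy, coords[depot]) / speed    # node -> depot
        # Time masking: keep the node only if the agent can still get home.
        if go + back <= time_left:
            feasible.append(node)
    return feasible

# Toy instance: depot at the origin plus three reward nodes.
coords = [(0, 0), (1, 0), (0, 2), (3, 4)]

# Node 1 is blocked by another agent; node 3 is too far for a budget of 4.
print(feasible_nodes(coords, current=0, time_left=4.0, blocked={1}))  # [2]
```

In a neural decoder, a mask of this kind would typically be applied to the output scores before the next node is selected, so that infeasible nodes receive zero probability.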
