Robo-taxi Fleet Coordination at Scale via Reinforcement Learning
Luigi Tresca, Carolin Schmidt, James Harrison, Filipe Rodrigues, Gioele Zardini, Daniele Gammelli, Marco Pavone
TL;DR
This work tackles large-scale AMoD fleet coordination by combining optimization, graph representation learning, and reinforcement learning in a single Graph RL framework. It introduces a hierarchical three-step policy: (i) convex dispatching to match vehicles with passengers, (ii) a learned per-node desired vehicle distribution that guides future fleet states, and (iii) a minimum-cost flow problem that realizes this distribution through rebalancing, all facilitated by Graph Neural Networks. Across macroscopic and mesoscopic simulations, the approach achieves profits close to those of the MPC-Oracle with substantially lower rebalancing costs, and it transfers well across cities, granularities, and simulator fidelities, aided by meta-RL and offline learning options. The open-source benchmarks, datasets, and simulators enable reproducible evaluation and standardized comparisons, supporting practical deployment of AMoD systems.
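To make steps (ii) and (iii) of the three-step policy concrete, the following is a minimal sketch, not the authors' implementation. It assumes the learned policy has already produced a per-node distribution (here a hard-coded list standing in for the Graph Neural Network output), rounds it to integer vehicle targets, and then solves the rebalancing step as a minimum-cost flow with a plain successive-shortest-paths routine. The function names `desired_counts` and `min_cost_rebalance`, the three-node graph, and the travel costs are all hypothetical and chosen only for illustration; step (i), passenger matching, is omitted.

```python
from collections import deque


def desired_counts(shares, fleet_size):
    """Round a per-node distribution (summing to 1) to integer vehicle targets."""
    raw = [s * fleet_size for s in shares]
    counts = [int(r) for r in raw]  # floor each share
    leftover = fleet_size - sum(counts)
    # Give the remaining vehicles to the nodes with the largest fractional parts.
    order = sorted(range(len(raw)), key=lambda i: raw[i] - counts[i], reverse=True)
    for i in order[:leftover]:
        counts[i] += 1
    return counts


def min_cost_rebalance(n, edges, idle, target):
    """Move vehicles from `idle` to `target` counts at minimum total travel cost.

    edges maps (i, j) -> cost of driving one empty vehicle from node i to node j.
    Returns {(i, j): number of vehicles sent along that edge}.
    """
    assert sum(idle) == sum(target), "fleet size must be conserved"
    src, sink = n, n + 1
    graph = [[] for _ in range(n + 2)]  # adjacency lists of edge ids
    to, cap, cost = [], [], []

    def add(u, v, c, w):
        # Forward edge gets an even id, its residual twin the next odd id.
        graph[u].append(len(to)); to.append(v); cap.append(c); cost.append(w)
        graph[v].append(len(to)); to.append(u); cap.append(0); cost.append(-w)

    total = 0
    for i in range(n):
        diff = idle[i] - target[i]
        if diff > 0:  # surplus node: source feeds it
            add(src, i, diff, 0)
            total += diff
        elif diff < 0:  # deficit node: drains into sink
            add(i, sink, -diff, 0)
    road_ids = {}
    for (i, j), w in edges.items():
        road_ids[(i, j)] = len(to)
        add(i, j, total, w)  # capacity `total` makes roads effectively uncapacitated

    sent = 0
    while sent < total:
        # Bellman-Ford (queue variant) on the residual graph.
        dist = [float("inf")] * (n + 2)
        prev = [-1] * (n + 2)
        in_queue = [False] * (n + 2)
        dist[src] = 0
        queue = deque([src])
        while queue:
            u = queue.popleft()
            in_queue[u] = False
            for e in graph[u]:
                v = to[e]
                if cap[e] > 0 and dist[u] + cost[e] < dist[v]:
                    dist[v], prev[v] = dist[u] + cost[e], e
                    if not in_queue[v]:
                        queue.append(v)
                        in_queue[v] = True
        if prev[sink] == -1:
            raise ValueError("target distribution is unreachable on this graph")
        # Find the bottleneck along the shortest path, then push flow along it.
        push, v = total - sent, sink
        while v != src:
            push = min(push, cap[prev[v]])
            v = to[prev[v] ^ 1]
        v = sink
        while v != src:
            cap[prev[v]] -= push
            cap[prev[v] ^ 1] += push
            v = to[prev[v] ^ 1]
        sent += push
    # Flow on a road equals the capacity accumulated on its residual twin.
    return {k: cap[e ^ 1] for k, e in road_ids.items() if cap[e ^ 1] > 0}


# Hypothetical 3-node network; costs are illustrative travel times.
roads = {(0, 1): 1, (1, 0): 1, (1, 2): 1, (2, 1): 1, (0, 2): 3, (2, 0): 3}
idle_now = [6, 2, 0]                       # idle vehicles per node
policy_output = [0.25, 0.25, 0.5]          # stand-in for the learned distribution
target = desired_counts(policy_output, sum(idle_now))
print(target)                                          # [2, 2, 4]
print(min_cost_rebalance(3, roads, idle_now, target))  # {(0, 1): 4, (1, 2): 4}
```

In this toy case the four surplus vehicles at node 0 are routed through node 1 (cost 2 each) rather than along the direct edge (cost 3). The design point the sketch illustrates is the division of labor: the learned component only outputs a distribution, while feasibility and cost minimization are delegated to a classical optimization problem.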
Abstract
Fleets of robo-taxis offering on-demand transportation services, commonly known as Autonomous Mobility-on-Demand (AMoD) systems, hold significant promise for societal benefits, such as reducing pollution, energy consumption, and urban congestion. However, orchestrating these systems at scale remains a critical challenge, with existing coordination algorithms often failing to exploit the systems' full potential. This work introduces a novel decision-making framework that unites mathematical modeling with data-driven techniques. In particular, we present the AMoD coordination problem through the lens of reinforcement learning and propose a graph network-based framework that exploits the main strengths of graph representation learning, reinforcement learning, and classical operations research tools. Extensive evaluations across diverse simulation fidelities and scenarios demonstrate the flexibility of our approach, achieving superior system performance, computational efficiency, and generalizability compared to prior methods. Finally, motivated by the need to democratize research efforts in this area, we release publicly available benchmarks, datasets, and simulators for network-level coordination alongside an open-source codebase designed to provide accessible simulation platforms and establish a standardized validation process for comparing methodologies. Code available at: https://github.com/StanfordASL/RL4AMOD
