Surge Routing: Event-informed Multiagent Reinforcement Learning for Autonomous Rideshare
Daniel Garces, Stephanie Gil
TL;DR
This work tackles surge demand in urban ridesharing by combining event-driven demand prediction from internet data with a scalable, event-informed multiagent reinforcement learning routing framework. The approach comprises an event-processing and demand-prediction pipeline that uses sentence embeddings and spectral clustering to produce sector-level demand estimates, which are then mapped to intersections via occupancy-aware assignment for RL routing. A rollout-based, one-agent-at-a-time controller with limited-sampling certainty equivalence keeps city-scale planning computationally tractable. Experiments on NYC HV-FHV data show wait-time overhead reduced by roughly 25-75% and about 1-4% more serviced requests, validating both the predictive and planning components and their integration in large-scale urban systems.
Abstract
Large events, such as conferences, concerts, and sports games, often cause surges in demand for ride services that are not captured in average demand patterns, posing unique challenges for routing algorithms. We propose a learning framework for an autonomous fleet of taxis that leverages event data from the internet to predict demand surges and generate cooperative routing policies. The framework combines two major components: (i) a demand prediction framework that uses textual event information, in the form of event descriptions and reviews, to predict event-driven demand surges over street intersections, and (ii) a scalable multiagent reinforcement learning framework that leverages these demand predictions and uses one-agent-at-a-time rollout combined with limited-sampling certainty equivalence to learn intersection-level routing policies. In our experiments, we use real NYC rideshare data for the year 2022 and information on more than 2000 events across 300 unique venues in Manhattan. We test our approach with a fleet of 100 taxis on a map with 2235 street intersections. Our experimental results demonstrate that our method learns routing policies that reduce wait-time overhead per serviced request by 25% to 75%, while picking up 1% to 4% more requests than other model-based RL frameworks and classical methods in operations research.
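To make the one-agent-at-a-time rollout idea concrete, the following is a minimal toy sketch and not the implementation evaluated in this work. It assumes a hypothetical 1-D street of `n_nodes` intersections, a three-move control set, a "stay put" base policy, and a fixed expected-demand table that stands in for random requests (plain certainty equivalence; the limited-sampling variant is not reproduced here). All function names and the cost function are illustrative assumptions.

```python
from typing import Dict, List

MOVES = (-1, 0, 1)  # assumed toy controls: step left, stay, step right


def wait_cost(positions: List[int], demand: Dict[int, float]) -> float:
    """Demand-weighted distance from each intersection to its nearest taxi."""
    return sum(d * min(abs(node - p) for p in positions)
               for node, d in demand.items())


def base_policy(position: int) -> int:
    """Hypothetical base policy: every taxi stays where it is."""
    return 0


def rollout_cost(positions: List[int], demand: Dict[int, float],
                 horizon: int) -> float:
    """Cost of following the base policy for `horizon` steps, with the
    random demand replaced by its expected value (certainty equivalence)."""
    total, current = 0.0, list(positions)
    for _ in range(horizon):
        current = [p + base_policy(p) for p in current]
        total += wait_cost(current, demand)
    return total


def one_agent_at_a_time(positions: List[int], demand: Dict[int, float],
                        n_nodes: int, horizon: int = 3) -> List[int]:
    """Choose moves sequentially: agent i optimizes its own move while
    earlier agents keep their chosen moves and later agents are assumed
    to follow the base policy."""
    chosen: List[int] = []
    for i, p in enumerate(positions):
        decided = [q + m for q, m in zip(positions[:i], chosen)]
        pending = [q + base_policy(q) for q in positions[i + 1:]]
        best_move, best_cost = 0, float("inf")
        for m in MOVES:
            if not 0 <= p + m < n_nodes:
                continue  # skip moves that leave the street
            nxt = decided + [p + m] + pending
            cost = wait_cost(nxt, demand) + rollout_cost(nxt, demand, horizon)
            if cost < best_cost:
                best_move, best_cost = m, cost
        chosen.append(best_move)
    return chosen


# Two taxis start at the middle of a 5-node street; demand sits at both ends.
moves = one_agent_at_a_time([2, 2], {0: 1.0, 4: 1.0}, n_nodes=5)
```

In this toy setting each agent searches only its own three moves, so the number of evaluated controls grows linearly with the fleet size rather than exponentially, which is the property that makes the sequential scheme attractive at city scale.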
