Hierarchical Optimization via LLM-Guided Objective Evolution for Mobility-on-Demand Systems
Yi Zhang, Yushen Long, Yun Ni, Liping Huang, Xiaohong Wang, Jun Liu
TL;DR
The paper addresses the data-efficiency and constraint-enforcement challenges of RL in mobility-on-demand by introducing a training-free hybrid framework where an LLM acts as a meta-optimizer to evolve high-level objectives that guide a low-level, constraint-conscious routing and dispatch optimizer. A harmony search–driven prompt evolution loop enables closed-loop refinement of semantic objectives using solver feedback, bridging high-level reasoning with low-level dynamics. Across NYC and Chicago taxi datasets, the approach achieves a mean improvement of about 16% in passenger waiting time over state-of-the-art baselines, with pronounced gains in large-scale, imbalanced scenarios, demonstrating robustness and practicality. The work highlights a scalable, data-efficient path for dynamic decision-making that integrates semantic reasoning with rigorous optimization, potentially informing real-time, constraint-aware ride-hailing systems.
Abstract
Online ride-hailing platforms aim to deliver efficient mobility-on-demand services, but they often struggle to balance dynamic and spatially heterogeneous supply and demand. Existing methods typically fall into two categories: reinforcement learning (RL) approaches, which suffer from data inefficiency, oversimplified modeling of real-world dynamics, and difficulty enforcing operational constraints; and decomposed online optimization methods, which rely on manually designed high-level objectives that lack awareness of low-level routing dynamics. To address these issues, we propose a novel hybrid framework that integrates a large language model (LLM) with mathematical optimization in a dynamic hierarchical system: (1) it is training-free, removing the need for the large-scale interaction data that RL requires, and (2) it leverages the LLM to bridge the cognitive limitations caused by problem decomposition by adaptively generating high-level objectives. Within this framework, the LLM serves as a meta-optimizer, producing semantic heuristics that guide a low-level optimizer responsible for constraint enforcement and real-time decision execution. These heuristics are refined through a closed-loop evolutionary process, driven by harmony search, which iteratively adapts the LLM prompts based on feasibility and performance feedback from the optimization layer. Extensive experiments on scenarios derived from both the New York and Chicago taxi datasets demonstrate the effectiveness of our approach, which achieves an average improvement of 16% in passenger waiting time over state-of-the-art baselines.
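To make the closed-loop refinement concrete, the following is a minimal sketch of a generic harmony search loop, not the paper's implementation. It assumes that each candidate is a numeric weight vector standing in for an LLM-generated high-level objective, and that `toy_solver_feedback` is a hypothetical placeholder for the feasibility and performance feedback returned by the low-level optimizer. All function names, parameters, and default values are illustrative assumptions.

```python
import random


def harmony_search(score, dim, bounds=(0.0, 1.0), hms=8, hmcr=0.9, par=0.3,
                   bandwidth=0.05, iterations=200, seed=0):
    """Minimize `score` over `dim`-dimensional vectors with basic harmony search.

    hms: harmony memory size, hmcr: memory consideration rate,
    par: pitch adjustment rate. All defaults are illustrative assumptions.
    """
    rng = random.Random(seed)
    lo, hi = bounds
    # Harmony memory: a pool of candidate solutions and their scores.
    memory = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(hms)]
    scores = [score(h) for h in memory]
    for _ in range(iterations):
        candidate = []
        for d in range(dim):
            if rng.random() < hmcr:
                # Memory consideration: reuse a value from a stored harmony.
                value = rng.choice(memory)[d]
                if rng.random() < par:
                    # Pitch adjustment: small local perturbation.
                    value += rng.uniform(-bandwidth, bandwidth)
            else:
                # Random selection: explore the full range.
                value = rng.uniform(lo, hi)
            candidate.append(min(hi, max(lo, value)))
        cand_score = score(candidate)
        # Replace the worst stored harmony if the new one is better.
        worst = max(range(hms), key=scores.__getitem__)
        if cand_score < scores[worst]:
            memory[worst], scores[worst] = candidate, cand_score
    best = min(range(hms), key=scores.__getitem__)
    return memory[best], scores[best]


def toy_solver_feedback(weights):
    # Hypothetical stand-in for solver feedback (e.g. mean waiting time):
    # squared distance to an arbitrary target weighting.
    target = [0.7, 0.2, 0.1]
    return sum((w - t) ** 2 for w, t in zip(weights, target))


best_weights, best_score = harmony_search(toy_solver_feedback, dim=3)
```

In the framework described above, the improvisation step would instead ask the LLM to propose or modify a semantic objective, and the score would come from running the low-level optimizer; the memory update rule (replace the worst stored candidate when a better one appears) is the part this sketch illustrates.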
