DiMA: An LLM-Powered Ride-Hailing Assistant at DiDi

Yansong Ning, Shuowei Cai, Wei Li, Jun Fang, Naiqiang Tan, Hua Chai, Hao Liu

TL;DR

DiMA addresses the need for an LLM-powered ride-hailing assistant capable of spatiotemporal reasoning and proactive ordering in urban contexts. It introduces a spatiotemporal tool-augmented order planning module, a cost-effective dialogue system with multi-type repliers and cost-aware LLM configurations, and a continual fine-tuning loop that draws on real-world interactions and role-play simulations. Deployed online in DiDi since May 2024, it achieves 93% accuracy in order planning and 92% in response generation. In offline experiments it improves on three state-of-the-art agent frameworks by up to 70.23% in order planning and 321.27% in response generation, while reducing latency by $0.72\times$ to $5.47\times$. The approach demonstrates a practical, scalable path to intelligent mobile ride-hailing assistants, and the code and an MCP service are released for community research.

Abstract

On-demand ride-hailing services like DiDi, Uber, and Lyft have transformed urban transportation, offering unmatched convenience and flexibility. In this paper, we introduce DiMA, an LLM-powered ride-hailing assistant deployed in DiDi Chuxing. Its goal is to provide seamless ride-hailing services and beyond through a natural and efficient conversational interface under dynamic and complex spatiotemporal urban contexts. To achieve this, we propose a spatiotemporal-aware order planning module that leverages external tools for precise spatiotemporal reasoning and progressive order planning. Additionally, we develop a cost-effective dialogue system that integrates multi-type dialogue repliers with cost-aware LLM configurations to handle diverse conversation goals and to trade off response quality against latency. Furthermore, we introduce a continual fine-tuning scheme that utilizes real-world interactions and simulated dialogues to align the assistant's behavior with human-preferred decision-making processes. Since its deployment in the DiDi application, DiMA has demonstrated exceptional performance, achieving 93% accuracy in order planning and 92% in response generation during real-world interactions. Offline experiments further validate DiMA's capabilities, showing improvements of up to 70.23% in order planning and 321.27% in response generation compared to three state-of-the-art agent frameworks, while reducing latency by $0.72\times$ to $5.47\times$. These results establish DiMA as an effective, efficient, and intelligent mobile assistant for ride-hailing services. Our project is released at https://github.com/usail-hkust/DiMA and we also release the MCP service (https://mcp.didichuxing.com/api) to foster the ride-hailing research community.
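
To make the idea of cost-aware LLM configurations more concrete, the following is a minimal sketch and not DiMA's actual implementation. It assumes a toy keyword classifier in place of a learned intent model, and every name in it (`ReplierConfig`, `REPLIERS`, `classify_goal`, `route`, the replier names, the model tiers, and the token budgets) is hypothetical. It only illustrates how a dialogue system might send each conversation goal to a replier backed by a differently sized LLM, so that cheap requests do not pay the latency of the largest model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReplierConfig:
    """Hypothetical configuration for one replier type."""
    name: str         # illustrative replier name, not from the paper
    model_tier: str   # illustrative LLM size: "small" or "large"
    max_tokens: int   # illustrative generation budget


# Hypothetical mapping from conversation goal to replier configuration.
REPLIERS = {
    "order": ReplierConfig("order_replier", "large", 256),
    "chitchat": ReplierConfig("chat_replier", "small", 64),
}

# Toy keyword list standing in for a learned intent classifier (assumption).
ORDER_KEYWORDS = ("ride", "taxi", "pick me up", "airport", "go to")


def classify_goal(query: str) -> str:
    """Return "order" if the query looks like a trip request, else "chitchat"."""
    text = query.lower()
    return "order" if any(keyword in text for keyword in ORDER_KEYWORDS) else "chitchat"


def route(query: str) -> ReplierConfig:
    """Pick the replier configuration for a user query."""
    return REPLIERS[classify_goal(query)]
```

In this sketch, a query such as "I need a ride to the airport at 3pm" is routed to the larger configuration, while small talk goes to the cheaper one. A production system would replace the keyword check with a trained classifier and tune the tiers against measured quality and latency.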

Paper Structure

This paper contains 40 sections, 1 equation, 7 figures, 7 tables.

Figures (7)

  • Figure 1: The conversational interface in DiDi Chuxing App. Given a user query issued on 2024-08-28, (a) DiMA proactively guides the user to complete necessary spatiotemporal trip information, and (b) automatically schedules a future ride-hailing order from Guangdong Museum East Gate to Terminal T2 of Guangzhou Baiyun International Airport.
  • Figure 2: Performance of existing LLMs as a ride-hailing assistant. The results are obtained based on 500 human annotated real-world ride-hailing requests.
  • Figure 3: An overview of DiMA.
  • Figure 4: API and tool lib.
  • Figure 5: Comparison of DiMA's performance over time. We periodically (every three days) fine-tune DiMA using the data collected up to the latest date.
  • ...and 2 more figures

Theorems & Definitions (2)

  • Definition 1
  • Definition 2