
LMMCoDrive: Cooperative Driving with Large Multimodal Model

Haichao Liu, Ruoyu Yao, Zhenmin Huang, Shaojie Shen, Jun Ma

TL;DR

LMMCoDrive is a cooperative driving framework that uses a Large Multimodal Model (LMM) to improve traffic efficiency in dynamic urban environments, as a step towards practical, efficient, and safe Autonomous Mobility-on-Demand (AMoD) systems.

Abstract

To address the intricate challenges of decentralized cooperative scheduling and motion planning in Autonomous Mobility-on-Demand (AMoD) systems, this paper introduces LMMCoDrive, a novel cooperative driving framework that leverages a Large Multimodal Model (LMM) to enhance traffic efficiency in dynamic urban environments. The framework integrates the scheduling and motion planning processes to ensure the effective operation of Cooperative Autonomous Vehicles (CAVs). The spatial relationship between CAVs and passenger requests is abstracted into a Bird's-Eye View (BEV) to fully exploit the potential of the LMM. In addition, the trajectory of each CAV is refined while safety constraints ensure collision avoidance. A decentralized optimization strategy, facilitated by the Alternating Direction Method of Multipliers (ADMM) within the LMM framework, is proposed to drive the graph evolution of CAVs. Simulation results demonstrate the pivotal role and significant impact of the LMM in optimizing CAV scheduling and enhancing the decentralized cooperative optimization process for each vehicle. This marks a substantial stride towards practical, efficient, and safe AMoD systems that are poised to revolutionize urban transportation. The code is available at https://github.com/henryhcliu/LMMCoDrive.

Paper Structure

This paper contains 18 sections, 8 equations, 5 figures, 1 table, and 2 algorithms.

Figures (5)

  • Figure 1: Demonstration of the autonomous mobility-on-demand system in an urban scenario. The red squares denote passenger requests, and the free vehicles are to be scheduled by the LMM. The abstracted bird's-eye-view graphics are generated together with supplementary textual information and sent to the LMM for multiple decisions.
  • Figure 2: Overall architecture of LMMCoDrive. It is composed of a multimodal integrator, a memory and reflection module, a retrieval module for similar memory and reflections, and a decentralized cooperative driving module. CMP means cooperative motion planning for the CAVs in the AMoD system.
  • Figure 3: Illustration of the retrieval for LMMCoDrive. A query BEV image is embedded with a pre-trained Swin Transformer, and its pairwise similarity to each memory vector is then computed. The Top-K most similar memory samples are retrieved to support online reasoning.
  • Figure 4: The reasoning demonstration of the AMoD system utilizing our proposed LMMCoDrive framework: Green rectangles represent free vehicles, whereas yellow rectangles indicate vehicles occupied by passengers. Red squares signify pending requests from passengers within the AMoD system. The corresponding responses from the LMM are also revealed alongside the BEVs.
  • Figure 5: Computation time for each receding-horizon episode of cooperative motion planning. In the first column of boxes, the vehicle connection graph is evolved by the LMM; the second column shows the results of the Manhattan distance-based heuristic method.
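
The retrieval step described in the Figure 3 caption (embed a query BEV image, compare it with stored memory vectors, keep the Top-K matches) can be illustrated with a minimal sketch. It rests on several assumptions that are not taken from the paper: cosine similarity stands in for the unspecified similarity measure, short hand-written vectors stand in for Swin Transformer embeddings, and the names `cosine_similarity`, `retrieve_top_k`, and the `scene_*` keys are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors (assumed metric)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as dissimilar to everything
    return dot / (norm_a * norm_b)


def retrieve_top_k(query, memory, k=2):
    """Return the keys of the k memory vectors most similar to the query."""
    scored = [(cosine_similarity(query, vec), key) for key, vec in memory.items()]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [key for _, key in scored[:k]]


# Toy memory bank: in practice each vector would be an image embedding.
memory = {
    "scene_a": [1.0, 0.0, 0.0],
    "scene_b": [0.7, 0.7, 0.0],
    "scene_c": [0.0, 1.0, 0.0],
}

query_embedding = [1.0, 0.05, 0.0]  # hypothetical embedding of the query BEV
top = retrieve_top_k(query_embedding, memory, k=2)
print(top)
```

In this toy setup the query lies closest to `scene_a`, then `scene_b`, so those two entries would be the ones handed to the LMM as supporting memory.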