Dual-AEB: Synergizing Rule-Based and Multimodal Large Language Models for Effective Emergency Braking

Wei Zhang, Pengfei Li, Junli Wang, Bingchuan Sun, Qihao Jin, Guangjun Bao, Shibo Rui, Yang Yu, Wenchao Ding, Peng Li, Yilun Chen

TL;DR

Dual-AEB is a system that combines an advanced multimodal large language model (MLLM) for comprehensive scene understanding with a conventional rule-based rapid AEB for quick response times. To the best of the authors' knowledge, it is the first method to incorporate MLLMs within AEB systems.

Abstract

Automatic Emergency Braking (AEB) systems are a crucial component in ensuring the safety of passengers in autonomous vehicles. Conventional AEB systems primarily rely on closed-set perception modules to recognize traffic conditions and assess collision risks. To enhance the adaptability of AEB systems in open scenarios, we propose Dual-AEB, a system that combines an advanced multimodal large language model (MLLM) for comprehensive scene understanding with a conventional rule-based rapid AEB to ensure quick response times. To the best of our knowledge, Dual-AEB is the first method to incorporate MLLMs within AEB systems. We validate the effectiveness of our method through extensive experiments. The source code will be available at https://github.com/ChipsICU/Dual-AEB.

Paper Structure

This paper contains 19 sections, 2 equations, 4 figures, 6 tables, 1 algorithm.

Figures (4)

  • Figure 1: Conventional AEB systems tend to fail in these situations: (a) when early detection of a pedestrian is required to brake in advance and avoid danger, and (b) when incorrect perception triggers AEB unnecessarily. Such scenarios are challenging for conventional AEB methods.
  • Figure 2: Overview of our method. Dual-AEB (a) includes both the quick (rule-based AEB) and slow (MLLM-powered AEB) modules. After receiving information from autonomous driving models (AD-Models), the braking signal can either be directly output by (c), as indicated by dashed lines, or sent to (b), as indicated by solid lines, where the MLLM-powered AEB evaluates and decides whether to confirm or adjust.
  • Figure 3: Qualitative analysis: the MLLM-powered AEB provides reasonable descriptions for different meta actions. The left side shows the ground truth results, while the right side shows the predicted results.
  • Figure 4: Qualitative analysis: Dual-AEB provides reasonable descriptions on our in-house dataset. The examples cover three tasks: scenario description, object description, and decision making.
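
The control flow described in the Figure 2 caption can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper's code: the `Observation` fields, the time-to-collision rule with a 2-second threshold, and the `slow_review` callable that stands in for the MLLM-powered module. It only shows the routing: the rule-based signal is either output directly or passed to the slow module, which confirms or adjusts it.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Observation:
    """Hypothetical perception summary received from the AD-Models."""
    distance_m: float          # gap to the nearest obstacle ahead
    closing_speed_mps: float   # positive when the gap is shrinking


def rule_based_brake(obs: Observation, ttc_threshold_s: float = 2.0) -> bool:
    """Quick module (assumed rule): brake if time-to-collision is below a threshold."""
    if obs.closing_speed_mps <= 0:
        return False  # not closing in on the obstacle, so no collision risk
    return obs.distance_m / obs.closing_speed_mps < ttc_threshold_s


def dual_aeb(
    obs: Observation,
    slow_review: Optional[Callable[[Observation, bool], bool]] = None,
) -> bool:
    """Return the final braking decision.

    Without a slow module, the quick signal is output directly.
    With one, the slow module sees the quick signal and confirms or adjusts it.
    """
    quick_signal = rule_based_brake(obs)
    if slow_review is None:
        return quick_signal
    return slow_review(obs, quick_signal)


if __name__ == "__main__":
    near = Observation(distance_m=10.0, closing_speed_mps=10.0)
    far = Observation(distance_m=100.0, closing_speed_mps=10.0)
    print(dual_aeb(near))                                # quick path only
    print(dual_aeb(far, slow_review=lambda o, q: True))  # slow module adjusts
```

In a real system the slow module would be a call into an MLLM with camera input and would run at a much lower rate than the rule-based check; the stub lambda here only demonstrates that it can override the quick decision in either direction.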