Visionary Co-Driver: Enhancing Driver Perception of Potential Risks with LLM and HUD

Wei Xiang, Ziyue Lei, Jie Wang, Yingying Huang, Qi Zheng, Tianyi Zhang, An Zhao, Lingyun Sun

TL;DR

Visionary Co-Driver (VCD) tackles the challenge of perceiving non-collision roadside risks by coupling video-driven road-scene processing with large language models to reason about pedestrian intentions, then presenting adaptive, gaze-aware warnings on a HUD. The system pipelines road-scene perception (YOLOX+ByteTrack, SAM, DPT) with GPT-3.5-16k-0613-based risk analysis to output structured risk signals, which are visually conveyed through an eye-tracking-driven HUD design. A controlled user study (n=36 valid) shows improved pedestrian risk recognition and favorable user experience without added cognitive burden, supporting LLM-assisted co-driver collaboration while underscoring the need for safety safeguards and broader real-world validation. The work demonstrates the practical potential of LLM-enabled cognitive support in human-vehicle systems, highlighting design considerations for reliability, latency, and deployment in diverse driving contexts.

Abstract

Drivers' perception of risky situations has always been a challenge in driving. Existing risk-detection methods excel at identifying collisions but face challenges in assessing the behavior of road users in non-collision situations. This paper introduces Visionary Co-Driver, a system that leverages large language models to identify non-collision roadside risks and alert drivers based on their eye movements. Specifically, the system combines video processing algorithms and LLMs to identify potentially risky road users. These risks are dynamically indicated on an adaptive heads-up display interface to enhance drivers' attention. A user study with 41 drivers confirms that Visionary Co-Driver improves drivers' risk perception and supports their recognition of roadside risks.
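The gaze-aware alerting idea described above can be sketched in a few lines. This is only an illustrative sketch, not the authors' implementation: the `RiskSignal` fields, the 0 to 2 risk scale, the pixel radius, and the rule "suppress the warning when the driver is already looking at the risk" are all assumptions made for the example, since the abstract does not specify the structured output format or the HUD adaptation rule.

```python
import math
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class RiskSignal:
    """Hypothetical structured risk signal for one tracked road user."""
    track_id: int                    # ID assigned by the tracker (assumed field)
    label: str                       # e.g. "pedestrian"
    risk_level: int                  # assumed scale: 0 = none, 1 = medium, 2 = high
    center_px: Tuple[float, float]   # position in HUD pixel coordinates


def needs_alert(signal: RiskSignal,
                gaze_px: Tuple[float, float],
                radius_px: float = 120.0,
                min_level: int = 1) -> bool:
    """Return True if the risk is high enough and lies outside the gaze radius.

    Assumption: a risk the driver is already looking at needs no extra warning.
    """
    if signal.risk_level < min_level:
        return False
    dx = signal.center_px[0] - gaze_px[0]
    dy = signal.center_px[1] - gaze_px[1]
    return math.hypot(dx, dy) > radius_px


def select_alerts(signals: List[RiskSignal],
                  gaze_px: Tuple[float, float]) -> List[RiskSignal]:
    """Filter the risk signals down to those the HUD should highlight."""
    return [s for s in signals if needs_alert(s, gaze_px)]


# Example: the driver is looking near pedestrian 1, so only pedestrian 2 is flagged.
signals = [
    RiskSignal(1, "pedestrian", 2, (100.0, 100.0)),
    RiskSignal(2, "pedestrian", 2, (800.0, 300.0)),
    RiskSignal(3, "pedestrian", 0, (900.0, 300.0)),
]
alerts = select_alerts(signals, gaze_px=(110.0, 105.0))
print([s.track_id for s in alerts])
```

A distance threshold in screen space is the simplest possible gaze test; a real system would more likely reason about fixation duration and visual angle, which this sketch leaves out.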

Paper Structure

This paper contains 50 sections, 10 figures, 5 tables.

Figures (10)

  • Figure 1: A typical road scene as perceived by a driver at a crossroad, with color masks separating the "on road" and "roadside" areas. On the left and far ahead, a truck, a car, and a standing pedestrian are labeled "On Road Risk". On the left and right, several standing pedestrians and a walking pedestrian are labeled "Roadside Risk".
  • Figure 2: System design of VCD
  • Figure 3: Typical potential roadside risk caused by pedestrian
  • Figure 5: Reasoning process of VCD
  • Figure 6: HUD interface
  • ...and 5 more figures