Driving behavior recognition via self-discovery learning

Yilin Wang

TL;DR

This paper presents a meta-modelling framework that automates the labor-intensive, time-consuming, and expensive process of manually cataloging driving behavior patterns, with the aim of supporting autonomous driving systems.

Abstract

Autonomous driving systems require a deep understanding of human driving behaviors to achieve higher intelligence and safety. Despite advancements in deep learning, challenges such as long-tail distribution due to scarce samples and confusion from similar behaviors hinder effective driving behavior detection. Existing methods often fail to address sample confusion adequately, as datasets frequently contain ambiguous samples that obscure unique semantic information.

Paper Structure

This paper contains 16 sections, 9 equations, 5 figures, 2 tables, 1 algorithm.

Figures (5)

  • Figure 1: Confusing samples in the HDD dataset. Red denotes specific information, whereas green signifies confusing information. "Left branch change" indicates a vehicle exits the main road to take a left branch or fork, "Left lane change" indicates a vehicle moves into the adjacent lane on its left, "Intersection passing" indicates driving through an intersection where two or more roads meet, and "Crosswalk passing" indicates driving through a designated pedestrian crossing area.
  • Figure 2: The proposed framework consists of three parts: (1) spatial-temporal Transformer for feature extraction, (2) temporal discovery module for feature representation $\mathcal{F}$ reconstruction using decoder $D(\cdot; \varphi)$, and (3) sample discovery module to reduce scene confusion with dynamically updated meta-variables and input feature weights.
  • Figure 3: Online detection examples on the HDD dataset comparing our proposed SDL method with the baseline E2E-load. Red boxes indicate retrieval frames, red text denotes ground truth, green text represents E2E predictions, and blue text shows SDL results. Our method effectively distinguishes similar behaviors and improves accuracy in scenarios with short-duration actions, demonstrating superior spatial-temporal perception and hidden cue capture.
  • Figure 4: Visualization of frame-level features using t-SNE with varying $\alpha$ values in the feature update equation. As $\alpha$ increases, inter-class distances expand, enhancing feature discriminative power.
  • Figure 5: Visualization of driving behavior changes and corresponding uncertainty curves over time. Consistent behavior shows high, stable uncertainty, while behavior changes cause sharp drops in uncertainty, indicating boundary frames between behaviors.
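
The Figure 5 caption says that sharp drops in the uncertainty curve mark boundary frames between behaviors. The sketch below illustrates that idea on a toy curve. It is not the paper's method: the function name `find_boundary_frames`, the frame-to-frame difference rule, the `drop_threshold` value, and the sample curve are all assumptions made for illustration.

```python
def find_boundary_frames(uncertainty, drop_threshold=0.3):
    """Return indices of frames whose uncertainty falls sharply.

    Hypothetical helper: a frame t is flagged when the uncertainty
    decreases by at least `drop_threshold` relative to frame t - 1.
    The threshold is an assumed value, not taken from the paper.
    """
    boundaries = []
    for t in range(1, len(uncertainty)):
        # A positive difference here means uncertainty dropped at frame t.
        if uncertainty[t - 1] - uncertainty[t] >= drop_threshold:
            boundaries.append(t)
    return boundaries


# Made-up curve: high and stable, with two sharp dips.
curve = [0.90, 0.88, 0.91, 0.40, 0.85, 0.90, 0.87, 0.35, 0.90]
print(find_boundary_frames(curve))  # frames 3 and 7 are flagged
```

A fixed threshold is the simplest possible rule. A real pipeline would likely smooth the curve or use a relative drop, since the scale of the uncertainty values is not given in the caption.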