What Matters in Autonomous Driving Anomaly Detection: A Weakly Supervised Horizon

Utkarsh Tiwari, Snehashis Majhi, Michal Balazia, François Brémond

TL;DR

The paper tackles video anomaly detection for ego-centric autonomous driving under weak supervision, where only video-level labels are available. It reorganizes the DoTA dataset as WS-DoTA to enable weakly-supervised training and introduces a Feature Transformation Block (FTB) to inject motion and spatial cues into CLIP-based features. By evaluating four state-of-the-art weakly-supervised VAD methods (RTFM, MGFN, UR-DMU, OE-CTST) with FTB, the study demonstrates significant performance gains, especially with the M3 variant, which fuses spatial semantics with sharp motion cues. The work provides a practical dataset and methodological insights for weakly-supervised VAD research in autonomous driving, with a code and dataset release planned to support reproducibility and benchmarking.

Abstract

Video anomaly detection (VAD) in autonomous driving scenarios is an important task; however, it involves several challenges due to the ego-centric view and moving camera. As a result, it remains largely under-explored. While recent weakly-supervised VAD methods have shown remarkable progress in detecting critical real-world anomalies in static-camera scenarios, the development and validation of such methods are yet to be explored for moving-camera VAD. This is mainly because existing datasets such as DoTA do not meet the training preconditions of weakly-supervised learning. In this paper, we aim to promote weakly-supervised method development for autonomous driving VAD. We reorganize the DoTA dataset and validate recent powerful weakly-supervised VAD methods on moving-camera scenarios. Further, we provide a detailed analysis of which modifications to state-of-the-art methods can significantly improve detection performance. Towards this, we propose a "feature transformation block" and show through experimentation that our propositions can significantly improve existing weakly-supervised VAD methods for autonomous driving. Our code, dataset, and demo will be released at github.com/ut21/WSAD-Driving
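
To make the weak-supervision setting concrete, the sketch below shows one generic way a video-level label can supervise per-snippet anomaly scores: pool the highest snippet scores into a single video score and compare it with the label. This is an illustrative assumption, not the training objective of this paper or of any of the evaluated methods; the function names `video_score` and `video_level_bce`, the pooling size `k`, and the example scores are all hypothetical.

```python
import numpy as np


def video_score(snippet_scores, k=3):
    """Pool per-snippet anomaly scores into one video-level score.

    Averages the k highest snippet scores (top-k pooling). Hypothetical
    helper for illustration only.
    """
    scores = np.asarray(snippet_scores, dtype=float)
    if scores.size == 0:
        raise ValueError("snippet_scores must not be empty")
    k = max(1, min(k, scores.size))
    top_k = np.sort(scores)[-k:]  # k largest values
    return float(top_k.mean())


def video_level_bce(snippet_scores, video_label, k=3, eps=1e-7):
    """Binary cross-entropy between the pooled score and a 0/1 video label.

    Only the video-level label is used; no frame-level annotation is needed.
    """
    p = float(np.clip(video_score(snippet_scores, k), eps, 1.0 - eps))
    return float(-(video_label * np.log(p) + (1 - video_label) * np.log(1.0 - p)))


# Made-up snippet scores in [0, 1] for two videos.
normal_video = [0.05, 0.10, 0.08, 0.04, 0.06]
anomalous_video = [0.05, 0.10, 0.90, 0.85, 0.07]

print(video_score(normal_video, k=2))
print(video_score(anomalous_video, k=2))
print(video_level_bce(anomalous_video, video_label=1, k=2))
```

With pooling of this kind, only a few high-scoring snippets are needed to match an anomalous video's label, which is why training of this sort requires only video-level labels.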

Paper Structure (18 sections, 7 figures, 2 tables)

Figures (7)

  • Figure 1: Our framework for the experimental analysis of weakly-supervised video anomaly detection methods on autonomous driving videos. Here, we integrate a feature transformation block (FTB) to improve the performance of state-of-the-art methods.
  • Figure 2: Visualization of ground truth vs. prediction heatmaps for state-of-the-art methods with different feature maps obtained from the feature transformation block (FTB). We show this visualization for three challenging videos. More visualizations can be found in the appendix.
  • Figure 3: MGFN Framework
  • Figure 4: OE-CTST Framework
  • Figure 5: RTFM Framework
  • ...and 2 more figures