C-DIRA: Computationally Efficient Dynamic ROI Routing and Domain-Invariant Adversarial Learning for Lightweight Driver Behavior Recognition

Keito Inoshita

TL;DR

C-DIRA tackles real-time driver distraction recognition on edge devices with a lightweight dual-path architecture that fuses global context with saliency-driven ROI cues. It introduces dynamic ROI routing, which spends the extra ROI computation only on difficult samples, and uses pseudo-domain labeling with adversarial learning to learn domain-invariant representations that generalize better to unseen drivers and environments. Results on the State Farm dataset show competitive accuracy with significantly fewer FLOPs and lower latency, along with improved robustness under visual degradations and stronger domain generalization, supporting the approach's practicality for edge deployment. The work highlights how targeted local feature extraction and domain suppression can reconcile efficiency and performance in visually demanding driver monitoring tasks.
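The routing idea can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes that "difficulty" is judged by the maximum softmax probability of the global path, that the ROI path runs only when this confidence falls below a threshold tau, and that the two paths are fused by averaging their logits. The names route_and_classify and roi_branch are hypothetical.

```python
import math
from typing import Callable, List, Sequence, Tuple


def softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def route_and_classify(
    global_logits: Sequence[float],
    roi_branch: Callable[[], Sequence[float]],
    tau: float = 0.9,
) -> Tuple[int, bool]:
    """Return (predicted class, whether the ROI path was used).

    Assumption: a sample counts as "easy" when the global path's top
    softmax probability is at least tau; only "hard" samples pay for
    the ROI branch.
    """
    probs = softmax(global_logits)
    confidence = max(probs)
    if confidence >= tau:
        # Easy sample: skip the ROI branch entirely.
        return probs.index(confidence), False

    # Hard sample: run the (more expensive) ROI branch lazily.
    roi_logits = roi_branch()
    # Assumption: simple logit averaging stands in for fused classification.
    fused = [(g + r) / 2.0 for g, r in zip(global_logits, roi_logits)]
    fused_probs = softmax(fused)
    return fused_probs.index(max(fused_probs)), True
```

Passing the ROI branch as a callable keeps its cost off the easy path, which is where the FLOP savings would come from under these assumptions. Lowering tau routes fewer samples to the ROI path and raising it routes more.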

Abstract

Driver distraction behavior recognition using in-vehicle cameras demands real-time inference on edge devices. However, lightweight models often fail to capture fine-grained behavioral cues, resulting in reduced performance on unseen drivers or under varying conditions. ROI-based methods also increase computational cost, making it difficult to balance efficiency and accuracy. This work addresses the need for a lightweight architecture that overcomes these constraints. We propose Computationally efficient Dynamic region of Interest Routing and domain-invariant Adversarial learning for lightweight driver behavior recognition (C-DIRA). The framework combines saliency-driven Top-K ROI pooling and fused classification for local feature extraction and integration. Dynamic ROI routing enables selective computation by applying ROI inference only to high-difficulty samples. Moreover, pseudo-domain labeling and adversarial learning are used to learn domain-invariant features robust to driver and background variation. Experiments on the State Farm Distracted Driver Detection Dataset show that C-DIRA maintains high accuracy with significantly fewer FLOPs and lower latency than prior lightweight models. It also demonstrates robustness under visual degradation such as blur and low light, and stable performance across unseen domains. These results confirm C-DIRA's effectiveness in achieving compactness, efficiency, and generalization.
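Saliency-driven Top-K ROI pooling can be pictured with a small sketch. It is an illustration under stated assumptions, not the paper's method: the saliency map is taken to be a 2D grid aligned with the feature map, each ROI is a single grid cell, and pooling is a plain average over the selected cells. The function names top_k_locations and pool_roi_features are hypothetical.

```python
from typing import List, Sequence, Tuple


def top_k_locations(
    saliency: Sequence[Sequence[float]], k: int
) -> List[Tuple[int, int]]:
    """Return the (row, col) positions of the k most salient cells."""
    scored = [
        (saliency[r][c], r, c)
        for r in range(len(saliency))
        for c in range(len(saliency[0]))
    ]
    # Stable sort by descending saliency; ties keep row-major order.
    scored.sort(key=lambda t: -t[0])
    return [(r, c) for _, r, c in scored[:k]]


def pool_roi_features(
    features: Sequence[Sequence[Sequence[float]]],
    locations: Sequence[Tuple[int, int]],
) -> List[float]:
    """Average the channel vectors found at the selected locations.

    features[r][c] is the channel vector at grid cell (r, c).
    """
    n = len(locations)
    channels = len(features[0][0])
    return [
        sum(features[r][c][ch] for r, c in locations) / n
        for ch in range(channels)
    ]


# Usage: a 2x2 saliency grid and a 2-channel feature map.
saliency = [[0.1, 0.9], [0.8, 0.2]]
features = [[[1.0, 0.0], [2.0, 4.0]], [[4.0, 6.0], [0.0, 0.0]]]
rois = top_k_locations(saliency, k=2)
local_feature = pool_roi_features(features, rois)
```

The pooled vector would then be combined with the global feature for fused classification. How that fusion is done is not specified in this summary, so it is left out of the sketch.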

Paper Structure

This paper contains 28 sections, 33 equations, 9 figures, 4 tables, 3 algorithms.

Figures (9)

  • Figure 1: Overall framework of C-DIRA.
  • Figure 2: ROI-integrated architecture of C-DIRA.
  • Figure 3: Architecture of the dynamic ROI routing mechanism in C-DIRA.
  • Figure 4: F1 and ROI usage ratio under different routing thresholds $\tau$.
  • Figure 5: Class-wise ROI usage ratio on the test set ($\tau=0.9$).
  • ...and 4 more figures