Evaluating Driver Readiness in Conditionally Automated Vehicles from Eye-Tracking Data and Head Pose

Mostafa Kazemi; Mahdi Rezaei; Mohsen Azarmi

TL;DR

This study addresses driver readiness assessment in SAE Level 3 (conditionally automated) vehicles by predicting a continuous readiness index from in-cabin head pose and eye-tracking cues. It uses SPIGA to extract yaw, pitch, roll, and 98 facial landmarks as frame-level features, pairs them with the EAR, HR, and VR eye metrics, and feeds the result into Vanilla and Bidirectional LSTM models. Ground-truth readiness is derived from human ratings over 2-second windows and interpolated to frame level, which allows regression training on the DMD dataset. The best model, a Bidirectional LSTM using both head-pose and gaze features, achieves an MAE of 0.363. The results indicate that a multi-modal, temporally aware approach can estimate readiness, and the modular architecture can incorporate additional driver-specific signals to improve real-world applicability. The ground-truth readiness labels and time-series validation support the integration of driver readiness monitoring into Level 3 systems and safer handover decisions.
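The step from 2-second window ratings to a frame-level readiness index can be illustrated with a small sketch. It is not the authors' procedure: the linear interpolation, the choice to anchor each rating at the centre of its window, the 30 fps frame rate, and the function name `frame_level_index` are all assumptions made here for illustration.

```python
def frame_level_index(window_ratings, window_sec=2.0, fps=30):
    """Linearly interpolate one rating per window to one value per frame.

    Assumptions (not taken from the paper): each rating is anchored at the
    centre frame of its window, and frames before the first centre or after
    the last centre simply hold the nearest rating.
    """
    frames_per_window = int(round(window_sec * fps))
    n_frames = frames_per_window * len(window_ratings)
    # Centre frame position of every rated window.
    centres = [(i + 0.5) * frames_per_window for i in range(len(window_ratings))]

    index = []
    for f in range(n_frames):
        if f <= centres[0]:
            index.append(float(window_ratings[0]))
        elif f >= centres[-1]:
            index.append(float(window_ratings[-1]))
        else:
            # Find the pair of window centres that surround frame f.
            k = int((f - centres[0]) // frames_per_window)
            left, right = centres[k], centres[k + 1]
            t = (f - left) / (right - left)
            index.append((1 - t) * window_ratings[k] + t * window_ratings[k + 1])
    return index


# Three hypothetical window ratings covering six seconds of video.
ratings = [1.0, 3.0, 2.0]
per_frame = frame_level_index(ratings)
print(len(per_frame), per_frame[60])
```

With these made-up ratings the output has one value per frame, and frames between two window centres receive a blend of the two neighbouring ratings, which gives a regression target that changes smoothly rather than in 2-second steps.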

Abstract

As automated driving technology advances, the driver's ability to resume control in conditionally automated vehicles becomes increasingly critical. In SAE Level 3 (conditionally automated) vehicles, the driver needs to be available and ready to intervene when necessary, which makes it essential to evaluate their readiness accurately. This article presents a comprehensive analysis of driver readiness assessment by combining head pose features and eye-tracking data. The study explores the effectiveness of predictive models in evaluating driver readiness, addressing the challenges of dataset limitations and limited ground truth labels. Machine learning techniques, including LSTM architectures, are utilised to model driver readiness based on the spatio-temporal status of the driver's head pose and eye gaze. The experiments show that a Bidirectional LSTM architecture combining both feature sets achieves a mean absolute error of 0.363 on the DMD dataset, the best result among the evaluated models. The modular architecture of the proposed model also allows the integration of additional driver-specific features, such as steering wheel activity, enhancing its adaptability and real-world applicability.
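For readers unfamiliar with the reported metric, the mean absolute error is the average absolute gap between the predicted and the ground-truth readiness index over all frames. The sketch below uses made-up values and a helper name, `mean_absolute_error`, chosen here for illustration. It does not reproduce the paper's evaluation code or data.

```python
def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between targets and predictions."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one sample is required")
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical frame-level readiness values, not taken from the DMD dataset.
ground_truth = [1.0, 2.0, 3.0]
predicted = [1.5, 2.0, 2.0]
print(mean_absolute_error(ground_truth, predicted))  # prints 0.5
```

Because the error is expressed in the same units as the readiness index, the reported 0.363 can be read directly as the typical per-frame deviation on the rating scale.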

Paper Structure (27 sections, 9 equations, 14 figures, 2 tables)

INTRODUCTION
RELATED WORK
Vision-based driver monitoring
The driver readiness studies
DATASET AND GROUND TRUTH
Human ratings for driver readiness
Protocol for collecting ratings
Readiness index as ground truth
Qualitative analysis of readiness ratings
METHODOLOGY
Frame-level feature extraction
Head pose estimation
Eye-tracking data
Eye aspect ratio metric
Horizontal gaze ratio metric
...and 12 more sections

Figures (14)

Figure 1: Classification and positioning of gaze zones within the vehicle cabin.
Figure 2: Assigned ratings and derived readiness index for a 30-second time interval.
Figure 3: Deep learning architecture for driver readiness evaluation.
Figure 4: Head pose estimation for a sample frame of DMD dataset using SPIGA model.
Figure 5: Extracted facial landmarks via SPIGA model for a sample frame.
...and 9 more figures
