Thermal Imaging-based Real-time Fall Detection using Motion Flow and Attention-enhanced Convolutional Recurrent Architecture

Christopher Silver, Thangarajah Akilan

TL;DR

This work tackles privacy-preserving, real-time fall detection from thermal imaging by proposing a BiConvLSTM-based architecture augmented with spatial, temporal, feature, self-, and general attention mechanisms, with optional motion-flow inputs. In an extensive ablation study across hundreds of variants, the BiConvLSTM + Layer-specific Attention (M2) model achieves state-of-the-art performance on the TSF dataset with an ROC-AUC of $99.7\%$ and generalizes well to the new TF-66 benchmark ($\text{AUC} = 97.4\%$), while remaining feasible for real-time use via cloud-based deployment. The study also examines the trade-offs between accuracy, latency, and compute, showing that motion-flow enhancements yield accuracy gains at the cost of added latency, and it presents TF-66 as a more representative, privacy-preserving benchmark for assessing deployment readiness. Collectively, the results set a new standard for privacy-preserving thermal fall detection and provide a practical foundation for deployable eldercare AI tools that protect dignity and safety.

Abstract

Falls among seniors are a major public health issue. Existing solutions using wearable sensors, ambient sensors, and RGB-based vision systems face challenges in reliability, user compliance, and practicality. Studies indicate that stakeholders, such as older adults and eldercare facilities, prefer non-wearable, passive, privacy-preserving, and real-time fall detection systems that require no user interaction. This study proposes an advanced thermal fall detection method using a Bidirectional Convolutional Long Short-Term Memory (BiConvLSTM) model, enhanced with spatial, temporal, feature, self-, and general attention mechanisms. Through systematic experimentation across hundreds of model variations exploring the integration of attention mechanisms, recurrent modules, and motion flow, we identified top-performing architectures. Among them, BiConvLSTM achieved state-of-the-art performance with an ROC-AUC of $99.7\%$ on the TSF dataset and demonstrated robust results on TF-66, a recently introduced, diverse, and privacy-preserving benchmark. These results highlight the generalizability and practicality of the proposed model, setting new standards for thermal fall detection and paving the way toward deployable, high-performance solutions.
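Since the headline results are reported as ROC-AUC, a minimal sketch of how that metric can be computed may help in reading them. It uses the rank interpretation of ROC-AUC: the probability that a randomly chosen fall clip receives a higher score than a randomly chosen non-fall clip, with ties counted as one half. The function name `roc_auc` and the labels and scores below are hypothetical illustrations, not the authors' evaluation code or data from the paper.

```python
def roc_auc(labels, scores):
    """ROC-AUC as the fraction of (positive, negative) pairs ranked correctly.

    labels: iterable of 0/1 ground-truth values (1 = fall, 0 = no fall).
    scores: iterable of predicted fall probabilities, same length as labels.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half a win
    return wins / (len(pos) * len(neg))


# Hypothetical per-clip predictions, for illustration only.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.2, 0.1]
auc = roc_auc(labels, scores)
print(f"ROC-AUC = {auc:.3f}")
```

The pairwise loop is quadratic in the number of clips, which is fine for illustration; a sort-based implementation from an established library would be the usual choice for real evaluation.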

Paper Structure

This paper contains 22 sections, 6 equations, 3 figures, 6 tables.

Figures (3)

  • Figure 1: An illustration of the proposed attention-enhanced 3D convolutional recurrent architecture. This model extends a vanilla 3D CNN (cf. \ref{tab:Arch:vanilla}) by integrating three attention modules (a spatial attention layer, a temporal attention layer, and a feature attention layer) and a BiConvLSTM2D module.
  • Figure 2: Training performance of the BiConvLSTM + Layer-specific Attention (M2) model on the TF-66 (left column) and TSF (right column) datasets. The first row shows loss curves, the second row presents AUC and accuracy metrics, and the third row depicts F1 score and MCC trends.
  • Figure 3: Eight consecutive frames from 01-Fall-04 in the TF-66 dataset, starting at frame 35.
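
To make the idea of the temporal attention layer named in Figure 1 more concrete, the sketch below shows one generic way to weight frames in a sequence: score each frame's feature vector against a query vector, normalize the scores with a softmax, and pool the frames by those weights. This is a plain dot-product attention written under stated assumptions; the source does not specify the exact formulation of the paper's layer, and the names `temporal_attention`, `frames`, and `query` as well as all values are hypothetical.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def temporal_attention(frames, query):
    """Weight T per-frame feature vectors by relevance to a query vector.

    frames: list of T feature vectors (each a list of floats of length D).
    query:  list of D floats; in a trained model this would be learned.
    Returns (weights, pooled), where weights sum to 1 over the T frames and
    pooled is the attention-weighted average feature vector.
    """
    # Dot-product score of each frame against the query.
    scores = [sum(f_d * q_d for f_d, q_d in zip(f, query)) for f in frames]
    weights = softmax(scores)
    dim = len(frames[0])
    pooled = [sum(w * f[d] for w, f in zip(weights, frames)) for d in range(dim)]
    return weights, pooled


# Hypothetical features for four frames; the third has the strongest response,
# standing in for the frame where a fall-like motion is most visible.
frames = [[0.1, 0.0], [0.2, 0.1], [2.0, 1.5], [0.3, 0.2]]
query = [1.0, 1.0]
weights, pooled = temporal_attention(frames, query)
print([round(w, 3) for w in weights])
```

In this toy input the third frame receives the largest weight, which is the behavior attention is meant to provide: frames that carry the most evidence dominate the pooled representation.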