OwlSight: A Robust Illumination Adaptation Framework for Dark Video Human Action Recognition

Shihao Cheng; Jinlu Zhang; Yue Liu; Zhigang Tu

TL;DR

OwlSight addresses action recognition in dark videos by using illumination information throughout training via three modules: a Time-Consistency Module (TCM), a Luminance Adaptation Module (LAM), and a Reflect Augmentation Module (RAM). Trained end-to-end, the framework preserves temporal coherence, dynamically adapts brightness, and uses two interactive paths to maximize illumination utilization. A large-scale dataset, Dark-101, is introduced to support robust learning in diverse, very low-light scenarios. OwlSight achieves state-of-the-art results on four benchmarks, outperforming the previous best approaches by 5.36% on ARID1.5 and 1.72% on Dark-101. The work highlights the practical value of holistic illumination-aware learning for real-world dark-environment video analysis, with strong gains from temporal consistency and adaptive illumination mechanisms.

Abstract

Human action recognition in low-light environments is crucial for various real-world applications. However, existing approaches do not fully exploit brightness information throughout the training phase, leading to suboptimal performance. To address this limitation, we propose OwlSight, a biomimetic framework in which illumination enhancement interacts with action classification at every stage, enabling accurate human action recognition in dark videos. Specifically, OwlSight incorporates a Time-Consistency Module (TCM) to capture shallow spatiotemporal features while maintaining temporal coherence. These features are then processed by a Luminance Adaptation Module (LAM), which dynamically adjusts the brightness based on the input luminance distribution. Furthermore, a Reflect Augmentation Module (RAM) is presented to maximize illumination utilization and simultaneously enhance action recognition via two interactive paths. Additionally, we build Dark-101, a large-scale dataset comprising 18,310 dark videos across 101 action categories, significantly surpassing existing datasets (e.g., ARID1.5 and Dark-48) in scale and diversity. Extensive experiments demonstrate that the proposed OwlSight achieves state-of-the-art performance across four low-light action recognition benchmarks. Notably, it outperforms the previous best approaches by 5.36% on ARID1.5 and 1.72% on Dark-101, highlighting its effectiveness in challenging dark environments.
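To make the idea of adjusting brightness "based on the input luminance distribution" concrete, the sketch below uses classic adaptive gamma correction. It is not the paper's LAM, which is a learned module whose internals are not described in this abstract. The function names, the grayscale list-of-rows frame format, and the target value of 0.5 are all assumptions chosen for illustration. A single gamma is computed per clip so that the correction does not change from frame to frame, a simple stand-in for the temporal coherence that the TCM is said to preserve.

```python
import math

def mean_luminance(frame):
    """Average pixel value of a grayscale frame (rows of floats in [0, 1])."""
    total = sum(sum(row) for row in frame)
    count = sum(len(row) for row in frame)
    return total / count

def adaptive_gamma(frame, target=0.5, eps=1e-6):
    """Choose gamma so that the frame's mean luminance value maps to `target`.

    Hypothetical helper: solves mean ** gamma == target for gamma.
    Dark inputs (mean < target) give gamma < 1, which brightens.
    """
    mean = min(max(mean_luminance(frame), eps), 1.0 - eps)
    return math.log(target) / math.log(mean)

def adjust_frame(frame, gamma):
    """Apply the power-law correction pixel ** gamma to every pixel."""
    return [[pixel ** gamma for pixel in row] for row in frame]

def adjust_clip(clip, target=0.5):
    """Correct a whole clip with one gamma derived from the clip-wide mean.

    Sharing gamma across frames avoids brightness flicker over time.
    """
    all_rows = [row for frame in clip for row in frame]
    gamma = adaptive_gamma(all_rows, target)
    return [adjust_frame(frame, gamma) for frame in clip]

if __name__ == "__main__":
    dark_clip = [[[0.02, 0.05], [0.03, 0.06]],
                 [[0.03, 0.04], [0.05, 0.07]]]
    brightened = adjust_clip(dark_clip)
    print(mean_luminance(dark_clip[0]), mean_luminance(brightened[0]))
```

A fixed rule like this only rescales intensities. A learned module can additionally condition on the full luminance histogram and be trained jointly with the classifier, which is the design the abstract describes.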
