Elderly Activity Recognition in the Wild: Results from the EAR Challenge

Anh-Kiet Duong

TL;DR

The paper tackles elderly action recognition in unconstrained video for the EAR Challenge, which requires classifying Activities of Daily Living (ADLs) into six categories. It uses the Temporal Shift Module (TSM) with a ResNeXt50 backbone, fine-tuned via transfer learning on elderly-specific data, and relies on careful cross-dataset data curation to improve generalization. The initial submission reached 0.81455 accuracy on the public leaderboard (0.84272 after full training) and 0.81759 on the private leaderboard (0.85051 after full training), indicating strong performance and consistent generalization from the public to the private leaderboard. The work demonstrates that combining efficient temporal modeling with diverse data sources can robustly recognize elderly activities in real-world settings, with implications for care and monitoring systems.

Abstract

This paper presents our solution for the Elderly Action Recognition (EAR) Challenge, part of the Computer Vision for Smalls Workshop at WACV 2025. The competition focuses on recognizing Activities of Daily Living (ADLs) performed by the elderly and covers six action categories with a diverse dataset. Our approach builds upon a state-of-the-art action recognition model, fine-tuned through transfer learning on elderly-specific datasets to enhance adaptability. To improve generalization and mitigate dataset bias, we carefully curated training data from multiple publicly available sources and applied targeted pre-processing techniques. Our solution currently achieves 0.81455 accuracy on the public leaderboard, highlighting its effectiveness in classifying elderly activities. The source code is publicly available at https://github.com/ffyyytt/EAR-WACV25-DAKiet-TSM.

Paper Structure

This paper contains 10 sections and 1 table.