Towards LLM-Powered Ambient Sensor Based Multi-Person Human Activity Recognition
Xi Chen, Julien Cumin, Fano Ramparany, Dominique Vaufreydaz
TL;DR
The paper tackles multi-person ambient sensor-based human activity recognition (AHAR) by proposing LAHAR, an LLM-powered two-stage framework. The first stage converts sensor events into text, separates subjects, and generates action-level descriptions; the second applies activity-level reasoning to produce per-subject timelines. LAHAR relies on prompt engineering, in-context learning, and contextual prompts to address data scarcity, generalization, and explainability, mapping a sensor event sequence $S_T$, with events of the form $e_t=\langle t,s,c\rangle$, to activity timelines $A_T$. Evaluated on the ARAS dataset, LAHAR achieves high-temporal-resolution results comparable to state-of-the-art methods and demonstrates robust multi-subject separation, with qualitative demonstrations of per-subject activity descriptions. The approach highlights the potential of LLMs for explainable, privacy-preserving ambient HAR and suggests directions for broader LLM adoption and enhanced conversational explainability in real-world settings.
Abstract
Human Activity Recognition (HAR) is a central problem in fields such as healthcare, elderly care, and home security. However, traditional HAR approaches face challenges including data scarcity, difficulties in model generalization, and the complexity of recognizing activities in multi-person scenarios. This paper proposes LAHAR, a system framework based on large language models (LLMs). Using prompt engineering techniques, LAHAR addresses HAR in multi-person scenarios by enabling subject separation and action-level descriptions of events occurring in the environment. We validated our approach on the ARAS dataset, and the results demonstrate that LAHAR achieves accuracy comparable to that of the state-of-the-art method at higher resolutions and maintains robustness in multi-person scenarios.
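
To make the event-to-text step more concrete, the following is a minimal Python sketch of how events of the form $e_t=\langle t,s,c\rangle$ might be rendered as text lines and assembled into a prompt asking for subject separation and action-level descriptions. It is an illustration under stated assumptions, not the authors' implementation: the reading of t as a timestamp in seconds, s as a sensor identifier, and c as a binary state is assumed, and the sensor names, the prompt wording, and the helpers `event_to_text` and `build_prompt` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SensorEvent:
    t: int  # assumed: timestamp, in seconds since midnight
    s: str  # assumed: sensor identifier
    c: int  # assumed: binary sensor state (1 = on, 0 = off)


# Hypothetical sensor descriptions, for illustration only.
SENSOR_TEXT = {
    "kitchen_drawer": "the kitchen drawer contact sensor",
    "couch_pressure": "the couch pressure sensor",
}


def event_to_text(e: SensorEvent) -> str:
    """Render one event as a human-readable line."""
    hours, rem = divmod(e.t, 3600)
    minutes, seconds = divmod(rem, 60)
    state = "turned on" if e.c else "turned off"
    # Fall back to the raw identifier when no description is known.
    name = SENSOR_TEXT.get(e.s, e.s)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d} {name} {state}"


def build_prompt(events, n_residents: int) -> str:
    """Assemble a prompt: one instruction line, then events in time order."""
    header = (
        f"{n_residents} residents share this home. "
        "Assign each sensor event below to one resident, "
        "then describe each resident's actions."
    )
    lines = [event_to_text(e) for e in sorted(events, key=lambda e: e.t)]
    return "\n".join([header, *lines])
```

For instance, `event_to_text(SensorEvent(t=3661, s="couch_pressure", c=1))` yields `01:01:01 the couch pressure sensor turned on`. The string returned by `build_prompt` would then be sent to an LLM; that call, and the second-stage activity-level reasoning, are left out of this sketch.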
