Layout Agnostic Human Activity Recognition in Smart Homes through Textual Descriptions Of Sensor Triggers (TDOST)
Megha Thukral, Sourish Gunesh Dhekane, Shruthi K. Hiremath, Harish Haresamudram, Thomas Ploetz
TL;DR
This work tackles the challenge of deploying human activity recognition (HAR) systems across smart homes with varying floor plans and sensor layouts by introducing TDOST, a framework that converts raw sensor triggers into contextual textual descriptions. By leveraging pre-trained language-model embeddings and a frozen inference pipeline, TDOST enables layout-agnostic transfer of activity recognizers from a labeled source home to unseen target homes without collecting target data. The paper systematically compares four TDOST variants (Basic, Temporal, LLM, and LLM+Temporal) and demonstrates substantial cross-dataset gains on CASAS benchmarks, with Sentence-T5 encoders and ConvBi-LSTM classifiers delivering the strongest transfer performance. It also discusses explainability, maintenance for life-long deployments, and potential extensions to multi-source training and few-shot activity transfer, highlighting practical implications for scalable smart-home HAR deployment.
Abstract
Human activity recognition (HAR) using ambient sensors in smart homes has numerous applications for human healthcare and wellness. However, building general-purpose HAR models that can be deployed to new smart home environments requires a significant amount of annotated sensor data and training overhead. Most smart homes vary significantly in their layouts, i.e., floor plans and the specifics of the embedded sensors, resulting in low generalizability of HAR models trained for specific homes. We address this limitation by introducing a novel, layout-agnostic modeling approach for HAR systems in smart homes that utilizes the transferable representational capacity of natural language descriptions of raw sensor data. To this end, we generate Textual Descriptions Of Sensor Triggers (TDOST) that encapsulate the surrounding trigger conditions and provide cues for underlying activities to the activity recognition models. Leveraging textual embeddings, rather than raw sensor data, we create activity recognition systems that predict standard activities across homes without either (re-)training or adaptation on target homes. In an extensive evaluation on benchmark CASAS datasets, we demonstrate the effectiveness of TDOST-based models in unseen smart homes. Furthermore, we conduct a detailed analysis of how the individual components of our approach affect downstream activity recognition performance.
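As an illustration of the core idea, the following minimal sketch renders a sensor trigger as a sentence that depends only on the sensor's type, location, and reported value, not on its home-specific identifier. The `SensorEvent` fields, the function names, and the sentence template are assumptions made for this example and are not taken from the paper; the actual TDOST variants may use different wording and additional context. Embedding the resulting sentences with a pre-trained text encoder would be a separate step that is not shown here.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class SensorEvent:
    """One raw sensor trigger (hypothetical schema for illustration)."""
    timestamp: str    # assumed format, e.g. "2009-06-10 08:15:42"
    sensor_id: str    # home-specific identifier, e.g. "M012"
    sensor_type: str  # e.g. "motion", "door", "temperature"
    location: str     # e.g. "kitchen", "bedroom"
    state: str        # reported value, e.g. "ON", "OFF", "OPEN"


def describe_trigger(event: SensorEvent) -> str:
    """Render one trigger as a sentence.

    The home-specific sensor_id is deliberately left out, so two homes
    with different sensor numbering yield the same description.
    """
    return (
        f"{event.sensor_type.capitalize()} sensor in the "
        f"{event.location} fired with value {event.state}."
    )


def describe_window(events: List[SensorEvent]) -> List[str]:
    """Describe every trigger in a window of consecutive events, in order."""
    return [describe_trigger(e) for e in events]


if __name__ == "__main__":
    # Two hypothetical homes with different sensor IDs for a kitchen motion sensor.
    home_a = SensorEvent("2009-06-10 08:15:42", "M012", "motion", "kitchen", "ON")
    home_b = SensorEvent("2011-03-02 07:58:10", "M031", "motion", "kitchen", "ON")
    print(describe_trigger(home_a))
    print(describe_trigger(home_a) == describe_trigger(home_b))
```

Because the description omits the sensor identifier, a classifier trained on sentence embeddings from one home can, in principle, be applied to sentences generated in another home whose sensors are numbered and placed differently.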
