MARAuder's Map: Motion-Aware Real-time Activity Recognition with Layout-Based Trajectories
Zishuai Liu, Weihang You, Jin Lu, Fei Dou
TL;DR
This paper presents MARAuder’s Map, a real-time human activity recognition (HAR) framework for smart homes that operates on unsegmented ambient sensor streams by projecting sensor activations onto the floorplan to form trajectory-like image sequences. A CNN encodes the spatial layout, a learnable time embedding encodes hour-of-day and day-of-week context, and an attention-enabled LSTM models temporal dependencies so that activities can be classified even when an observation window spans more than one activity. The approach is evaluated on three CASAS datasets (Milan, Kyoto7, Aruba), where it outperforms strong baselines and remains robust to temporal ambiguity and multi-activity windows. The results highlight the value of layout-grounded representations, structured temporal cues, and attention for accurate, real-time HAR in ambient-sensor settings, with potential for practical deployment.
Abstract
Ambient sensor-based human activity recognition (HAR) in smart homes remains challenging due to the need for real-time inference, spatially grounded reasoning, and context-aware temporal modeling. Existing approaches often rely on pre-segmented, within-activity data and overlook the physical layout of the environment, limiting their robustness in continuous, real-world deployments. In this paper, we propose MARAuder's Map, a novel framework for real-time activity recognition from raw, unsegmented sensor streams. Our method projects sensor activations onto the physical floorplan to generate trajectory-aware, image-like sequences that capture the spatial flow of human movement. These representations are processed by a hybrid deep learning model that jointly captures spatial structure and temporal dependencies. To enhance temporal awareness, we introduce a learnable time embedding module that encodes contextual cues such as hour-of-day and day-of-week. Additionally, an attention-based encoder selectively focuses on informative segments within each observation window, enabling accurate recognition even under cross-activity transitions and temporal ambiguity. Extensive experiments on multiple real-world smart home datasets demonstrate that our method outperforms strong baselines, offering a practical solution for real-time HAR in ambient sensor environments.
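To make the floorplan projection concrete, the following is a minimal sketch of how a window of sensor activations might be turned into image-like frames, together with the integer time indices a learnable embedding could look up. It is an illustration under stated assumptions, not the authors' implementation: the sensor IDs, the grid coordinates in `SENSOR_CELLS`, the 3x3 grid, the `decay` trail, and the function names `project_window` and `time_indices` are all hypothetical and do not come from the paper.

```python
from datetime import datetime

# Hypothetical sensor-to-grid mapping (row, col). A real deployment would
# derive these cells from the home's floorplan; these values are made up.
SENSOR_CELLS = {
    "M001": (0, 0),
    "M002": (0, 2),
    "M003": (2, 2),
}


def project_window(events, sensor_cells, grid_shape, decay=0.5):
    """Turn a window of (timestamp, sensor_id) activations into frames.

    Each frame is a 2D grid. The newest activation is set to 1.0 and older
    ones fade by `decay` per step, so the recent path stays visible as a
    trail. The decay trail is an assumption made for this sketch.
    """
    rows, cols = grid_shape
    frame = [[0.0] * cols for _ in range(rows)]
    frames = []
    for _, sensor_id in events:
        # Fade the previous trail before drawing the new activation.
        frame = [[value * decay for value in row] for row in frame]
        r, c = sensor_cells[sensor_id]
        frame[r][c] = 1.0
        # Store a copy so later steps do not mutate earlier frames.
        frames.append([row[:] for row in frame])
    return frames


def time_indices(ts):
    """Return (hour-of-day, day-of-week) indices for an embedding lookup."""
    return ts.hour, ts.weekday()  # hour in 0..23, Monday == 0


# An unsegmented window: three motion sensors firing in sequence.
window = [
    (datetime(2024, 1, 1, 7, 30, 0), "M001"),
    (datetime(2024, 1, 1, 7, 30, 5), "M002"),
    (datetime(2024, 1, 1, 7, 30, 9), "M003"),
]
frames = project_window(window, SENSOR_CELLS, grid_shape=(3, 3))
hour, weekday = time_indices(window[-1][0])
```

In a pipeline like the one described, each frame would be passed to a spatial encoder such as a CNN, the `(hour, weekday)` pair would index learnable embedding tables, and a sequence model with attention would consume the resulting per-frame features.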
