ADLGen: Synthesizing Symbolic, Event-Triggered Sensor Sequences for Human Activity Modeling

Weihang You, Hanqi Jiang, Zishuai Liu, Zihang Xie, Tianming Liu, Jin Lu, Fei Dou

TL;DR

ADLGen addresses the scarcity of privacy-constrained Activities of Daily Living (ADL) data by generating realistic, event-triggered symbolic sensor sequences conditioned on activities. It combines a decoder-only Transformer that uses sign-based tokenization and symbolic-temporal decoupling, a context- and floorplan-aware sampler, and an LLM-driven generate-evaluate-refine loop that enforces semantic and temporal coherence. On the CASAS Aruba dataset, the approach reports better statistical fidelity, semantic richness, and downstream human activity recognition (HAR) performance than baseline generators, and it supports few-shot augmentation for rare activities and cross-floorplan transfer. This privacy-preserving synthesis framework supports practical human activity modeling in ambient-assisted living and reduces the need for extensive real-world data collection when training activity recognizers.

Abstract

Real-world collection of Activities of Daily Living (ADL) data is challenging due to privacy concerns, costly deployment and labeling, and the inherent sparsity and imbalance of human behavior. We present ADLGen, a generative framework specifically designed to synthesize realistic, event-triggered, symbolic sensor sequences for ambient assistive environments. ADLGen integrates a decoder-only Transformer with sign-based symbolic-temporal encoding and a context- and layout-aware sampling mechanism to guide generation toward semantically rich and physically plausible sensor event sequences. To enhance semantic fidelity and correct structural inconsistencies, we further incorporate a large language model into an automatic generate-evaluate-refine loop, which verifies logical, behavioral, and temporal coherence and generates correction rules without manual intervention or environment-specific tuning. Through comprehensive experiments with novel evaluation metrics, ADLGen is shown to outperform baseline generators in statistical fidelity, semantic richness, and downstream activity recognition, offering a scalable and privacy-preserving solution for ADL data synthesis.
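
To make the idea of sign-based, symbolic-temporal decoupled encoding more concrete, the following is a minimal sketch rather than the authors' implementation. It assumes that "sign-based" means a sensor's ON/OFF state is carried by the sign of an integer sensor index, and that "decoupling" means inter-event time gaps are stored in a separate stream from the symbolic tokens. The function name `encode_events`, the sensor identifiers such as `M003`, and the `ON`/`OFF` state labels are hypothetical; the timestamp layout follows the example in the Figure 3 caption.

```python
from datetime import datetime

# Timestamp layout taken from the Figure 3 caption example
# ("2010-11-04,16:10:33.716795"); treat it as an assumption about the raw data.
TIME_FORMAT = "%Y-%m-%d,%H:%M:%S.%f"


def encode_events(events):
    """Split raw events into a symbolic stream and a temporal stream.

    events: list of (timestamp_str, sensor_id, state) tuples, where
    sensor_id looks like "M003" and state is "ON" or "OFF" (both hypothetical).
    Returns (symbols, gaps): signed sensor indices and seconds since the
    previous event (0.0 for the first event).
    """
    symbols, gaps = [], []
    previous_time = None
    for timestamp, sensor_id, state in events:
        current_time = datetime.strptime(timestamp, TIME_FORMAT)
        # Assumption: the numeric part of the ID identifies the sensor.
        # IDs with different letter prefixes but equal numbers would collide,
        # so a real vocabulary would need a proper sensor-to-index mapping.
        index = int(sensor_id[1:])
        # Assumed sign convention: positive for ON, negative for OFF.
        symbols.append(index if state == "ON" else -index)
        if previous_time is None:
            gaps.append(0.0)
        else:
            gaps.append((current_time - previous_time).total_seconds())
        previous_time = current_time
    return symbols, gaps


example_events = [
    ("2010-11-04,16:10:33.716795", "M003", "ON"),
    ("2010-11-04,16:10:35.716795", "M003", "OFF"),
    ("2010-11-04,16:10:36.216795", "M005", "ON"),
]
example_symbols, example_gaps = encode_events(example_events)
```

Keeping the two streams separate means a sequence model could in principle learn which sensor fires next independently of how long the gap is, which is one plausible reading of the decoupling described above.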

Paper Structure

This paper contains 59 sections, 26 equations, 13 figures, 8 tables, 2 algorithms.

Figures (13)

  • Figure 1: Semantic Quality evaluation (1-5 scale) across different daily activities.
  • Figure 2: Our Two-Stage Synthetic Activity Data Generation Framework. (a) Pretrain Stage: Transformer learns ADL patterns from sign-based, decoupled encoded sensor data. (b) Inference Stage: Transformer generates sequences which are subsequently refined semantically by a large language model to enhance coherence and contextual relevance.
  • Figure 3: Raw ADL Activity Data (Time: e.g., 2010-11-04,16:10:33.716795)
  • Figure 4: Comparison of encoding strategies
  • Figure 5: LLM evaluation framework with refinement pipeline.
  • ...and 8 more figures