Challenges and Limitations of Generative AI in Synthesizing Wearable Sensor Data

Flavio Di Martino, Franca Delmastro

TL;DR

This work scrutinizes state-of-the-art generative AI approaches for wearable sensor time series, focusing on multimodal, long-range, and conditional data generation. It develops a modality- and task-aware evaluation framework to jointly assess intrinsic fidelity and downstream predictive utility, and applies it to GAN and diffusion models on real-world datasets. Key findings show substantial cross-modal and temporal coherence gaps across models, with BioDiffusion providing the most consistent quality yet only modest gains in data augmentation. The study highlights the need for standardized time series (TS) evaluation metrics and outlines future directions toward scalable, personalized, and semantically conditioned synthetic wearable data generation.

Abstract

The widespread adoption of wearable sensors has the potential to provide massive and heterogeneous time series data, driving the use of Artificial Intelligence in human sensing applications. However, data collection remains limited due to stringent ethical regulations, privacy concerns, and other constraints, hindering progress in the field. Synthetic data generation, particularly through Generative Adversarial Networks and Diffusion Models, has emerged as a promising solution to mitigate both data scarcity and privacy issues. Yet these models are often limited to narrow operational scenarios, such as short-term and unimodal signal patterns. To address this gap, we present a systematic evaluation of state-of-the-art generative models for time series data, explicitly assessing their performance in challenging scenarios such as stress and emotion recognition. Our study examines the extent to which these models can jointly handle multi-modality, capture long-range dependencies, and support conditional generation, all of which are core requirements for real-world wearable sensor data generation. To enable a fair and rigorous comparison, we also introduce an evaluation framework that assesses both the intrinsic fidelity of the generated data and their utility in downstream predictive tasks. Our findings reveal critical limitations in the existing approaches, particularly in maintaining cross-modal consistency, preserving temporal coherence, and ensuring robust performance in both train-on-synthetic, test-on-real (TSTR) and data augmentation scenarios. Finally, we outline future research directions to enhance synthetic time series generation and improve the applicability of generative models in the wearable computing domain.
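
The two downstream-utility scenarios named in the abstract can be illustrated with a minimal sketch. It is not the authors' evaluation framework: the nearest-centroid classifier, the function names, and the toy Gaussian "features" are all assumptions chosen to keep the example self-contained. A classifier is trained on real data (a train-on-real baseline), on synthetic data only (train-on-synthetic, test-on-real), and on the union of both (data augmentation), and each is scored on the same held-out real test set.

```python
import numpy as np

def fit_centroids(X, y):
    """Hypothetical stand-in classifier: one mean vector per class."""
    classes = np.unique(y)
    centroids = np.stack([X[y == c].mean(axis=0) for c in classes])
    return classes, centroids

def predict(model, X):
    """Assign each sample to the class with the nearest centroid."""
    classes, centroids = model
    dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
    return classes[dists.argmin(axis=1)]

def accuracy(model, X, y):
    return float((predict(model, X) == y).mean())

def downstream_utility(X_real, y_real, X_syn, y_syn, X_test, y_test):
    """Score three training regimes on the same held-out REAL test set."""
    X_aug = np.concatenate([X_real, X_syn])
    y_aug = np.concatenate([y_real, y_syn])
    return {
        "TRTR": accuracy(fit_centroids(X_real, y_real), X_test, y_test),  # real-only baseline
        "TSTR": accuracy(fit_centroids(X_syn, y_syn), X_test, y_test),    # synthetic-only training
        "AUG": accuracy(fit_centroids(X_aug, y_aug), X_test, y_test),     # real + synthetic
    }

def toy_features(rng, n_per_class, shift=0.0):
    """Toy two-class feature vectors; `shift` mimics an imperfect generator."""
    X0 = rng.normal(0.0 + shift, 1.0, size=(n_per_class, 4))
    X1 = rng.normal(3.0 + shift, 1.0, size=(n_per_class, 4))
    X = np.concatenate([X0, X1])
    y = np.concatenate([np.zeros(n_per_class, int), np.ones(n_per_class, int)])
    return X, y

rng = np.random.default_rng(0)
X_real, y_real = toy_features(rng, 100)
X_syn, y_syn = toy_features(rng, 100, shift=0.5)  # assumed synthetic samples
X_test, y_test = toy_features(rng, 100)
scores = downstream_utility(X_real, y_real, X_syn, y_syn, X_test, y_test)
print(scores)
```

Comparing the synthetic-only and augmented scores against the real-only baseline is one common way to read such results: a large drop in the synthetic-only score suggests the generated data misses class-relevant structure, while an augmented score close to the baseline corresponds to the "modest gains" pattern described in the TL;DR.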

Paper Structure

This paper contains 21 sections, 1 figure, 8 tables.

Figures (1)

  • Figure 1: Signal- and class-specific t-SNE visualizations of real vs. synthetic distributions for the top-performing BioDiffusion model for each dataset.