Time Series Similarity Score Functions to Monitor and Interact with the Training and Denoising Process of a Time Series Diffusion Model applied to a Human Activity Recognition Dataset based on IMUs

Heiko Oppel, Andreas Spilz, Michael Munz

TL;DR

The paper addresses the challenge of assessing the quality of time-series data generated by denoising diffusion probabilistic models (DDPMs), which the standard training loss does not capture well. It proposes similarity metrics based on the power spectral density (PSD) and a class-optimized global alignment kernel (C-Opt GAK), and integrates them into both the training and denoising phases of an IMU-based time-series diffusion model (IMUDiffusion) to guide early stopping. The study reports that similarity-guided training reduces the number of training epochs by ~20% and can improve the performance of a downstream human activity recognition (HAR) classifier, with notable gains for several participants in leave-one-subject-out cross-validation (LOSOCV), while denoising-guided stopping offers additional but variable benefits. These findings suggest a practical pathway to more resource-efficient generative modeling for wearable-sensor time series, with potential applicability to other diffusion-based sequence generation tasks.

Abstract

Denoising diffusion probabilistic models are able to generate synthetic sensor signals. The training process of such a model is controlled by a loss function which measures the difference between the noise that was added in the forward process and the noise that was predicted by the diffusion model. This enables the generation of realistic data. However, the randomness within the process and the loss function itself make it difficult to estimate the quality of the data. Therefore, we examine multiple similarity metrics and adapt an existing metric to overcome this issue by monitoring the training and synthesis process using those metrics. The adapted metric can even be fine-tuned on the input data to comply with the requirements of an underlying classification task. We were able to significantly reduce the number of training epochs without a performance reduction in the classification task. An optimized training process not only saves resources, but also reduces the time for training generative models.
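
The idea of letting a similarity score, rather than the noise-prediction loss, decide when training ends can be sketched as a patience-based early-stopping rule. This is only an illustrative sketch: the class `SimilarityEarlyStopping`, the helper `epochs_until_stop`, and the parameters `patience` and `min_delta` are hypothetical names chosen here, and the stopping criterion actually used in the paper is not described in this section and may differ.

```python
from dataclasses import dataclass


@dataclass
class SimilarityEarlyStopping:
    """Hypothetical patience-based stopper; higher score means more similar."""

    patience: int = 5          # assumed: epochs tolerated without improvement
    min_delta: float = 0.0     # assumed: minimum gain that counts as improvement
    best_score: float = float("-inf")
    stale_epochs: int = 0

    def update(self, score: float) -> bool:
        """Record one epoch's similarity score; return True if training should stop."""
        if score > self.best_score + self.min_delta:
            self.best_score = score
            self.stale_epochs = 0
        else:
            self.stale_epochs += 1
        return self.stale_epochs >= self.patience


def epochs_until_stop(scores, patience=5, min_delta=0.0):
    """Return the 1-based epoch at which the stopper triggers, or len(scores)."""
    stopper = SimilarityEarlyStopping(patience=patience, min_delta=min_delta)
    for epoch, score in enumerate(scores, start=1):
        if stopper.update(score):
            return epoch
    return len(scores)
```

In a training loop, `update` would be called once per epoch with a similarity score computed between generated and real validation sequences. A small `patience` stops sooner but is more sensitive to the randomness of the sampling process.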

Paper Structure

This paper contains 21 sections, 10 equations, 6 figures, 2 tables.

Figures (6)

  • Figure 1: Influence of the scaling factor $\sigma$ on the cost function $D$.
  • Figure 2: Visual identification of the optimal sigma value for a given train and validation set: (a) for the Cycling activity performed by the participant with the ID 2; (b) for the Walking activity performed by the participant with the ID 1; (c) a summary over all participants by activity including the average and standard deviation of the similarity scores across all participants.
  • Figure 3: Visual comparison between a sequence from the training set and a sequence from the validation set. The training sequence was chosen randomly; the validation sequence is the one that resembles the training sequence the most according to the respective similarity metric. The training sequence is shown in red. The validation sequence is shown in a different color depending on the similarity metric that selected it. The top row shows the sequences in the time domain and the bottom row their respective power spectral densities.
  • Figure 4: Number of training epochs until the similarity score induced an early stopping of the training process. Subgraph (a) shows the results per participant and class when using the C-Opt GAK similarity metric for early stopping. Subgraph (b) summarizes the findings across all evaluated similarity metrics in a swarm-box-plot separated by class. Each black dot in the swarm-plot represents one participant.
  • Figure 5: Similarity scores at specific steps of the denoising process for a single participant (PID 2). Subgraph (a) shows the C-Opt GAK score, whereas subgraphs (b) and (c) show the cosine similarity score, once between the signals in the time domain and once between their PSDs. The results are further separated by the four activities Walking, Running, Jump Up and Cycling.
  • ...and 1 more figures
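
The two cosine-similarity variants named in the Figure 5 caption (between signals in the time domain and between their PSDs) can be illustrated with a minimal NumPy sketch. The function names, the plain periodogram used as the PSD estimate, and the sampling rate `fs` are assumptions made for this example; the PSD estimator and preprocessing used in the paper are not stated in this section.

```python
import numpy as np


def cosine_similarity(a, b, eps=1e-12):
    """Cosine of the angle between two 1-D vectors (eps avoids division by zero)."""
    a = np.asarray(a, dtype=float).ravel()
    b = np.asarray(b, dtype=float).ravel()
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + eps))


def periodogram_psd(x, fs):
    """Simple periodogram PSD estimate (assumed here; no windowing or averaging)."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()                      # remove the DC offset
    spectrum = np.fft.rfft(x)             # one-sided spectrum of a real signal
    # One-sided doubling is omitted; a constant scale does not change the cosine.
    return (np.abs(spectrum) ** 2) / (fs * x.size)


def psd_cosine_similarity(x, y, fs):
    """Cosine similarity between the PSD estimates of two equally long signals."""
    return cosine_similarity(periodogram_psd(x, fs), periodogram_psd(y, fs))


# Usage: two 2 Hz sinusoids that differ only by a 90 degree phase shift.
fs = 100.0                                # hypothetical sampling rate in Hz
t = np.arange(200) / fs
x = np.sin(2 * np.pi * 2.0 * t)
y = np.cos(2 * np.pi * 2.0 * t)
time_sim = cosine_similarity(x, y)
psd_sim = psd_cosine_similarity(x, y, fs)
```

In this toy case the time-domain score is close to zero while the PSD score is close to one, because the PSD discards phase. This shows why the two variants can rank the same pair of sequences very differently.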