Diverse Intra- and Inter-Domain Activity Style Fusion for Cross-Person Generalization in Activity Recognition

Junru Zhang, Lang Feng, Zhidan Liu, Yuhan Wu, Yang He, Yabo Dong, Duanqing Xu

TL;DR

This paper tackles cross-person HAR generalization by addressing limited intra- and inter-domain style diversity in source data. It proposes domain padding, a diffusion-based approach that uses a learned activity style conditioner and a style-fused sampling strategy to generate diverse, class-preserving samples across and within domains. The method, DI2SDiff, leverages classifier-free guidance and random style combinations to expand the domain coverage, yielding state-of-the-art generalization on DSADS, PAMAP2, and USC-HAD with small training sets. Empirical results show substantial improvements over strong DG baselines, particularly on USC-HAD where sub-domain structure is challenging, and demonstrate that the generated data can boost other DG methods as a plug-in augmentation. The approach offers a data-efficient path to robust HAR systems, reducing the need for extensive data collection on edge devices while enabling better deployment on unseen users.

Abstract

Existing domain generalization (DG) methods for cross-person generalization tasks often face challenges in capturing intra- and inter-domain style diversity, resulting in domain gaps with the target domain. In this study, we explore a novel perspective to tackle this problem, a process conceptualized as domain padding. This proposal aims to enrich the domain diversity by synthesizing intra- and inter-domain style data while maintaining robustness to class labels. We instantiate this concept using a conditional diffusion model and introduce a style-fused sampling strategy to enhance data generation diversity. In contrast to traditional condition-guided sampling, our style-fused sampling strategy allows for the flexible use of one or more random styles to guide data synthesis. This feature presents a notable advancement: it allows for the maximum utilization of possible permutations and combinations among existing styles to generate a broad spectrum of new style instances. Empirical evaluations on a broad range of datasets demonstrate that our generated data achieves remarkable diversity within the domain space. Both intra- and inter-domain generated data have proven to be significant and valuable, contributing to varying degrees of performance enhancements. Notably, our approach outperforms state-of-the-art DG methods in all human activity recognition tasks.
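
The style-fused sampling idea described above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: it assumes the usual classifier-free guidance form, in which the unconditional noise estimate is shifted by a weighted sum of per-style guidance terms. The names `toy_eps_model`, `style_fused_eps`, and `guidance_weight` are hypothetical, and the toy noise predictor only stands in for a trained diffusion network.

```python
import numpy as np


def toy_eps_model(x_t, t, style=None):
    """Hypothetical stand-in for a trained noise predictor.

    Returns an unconditional estimate when style is None, and a
    style-shifted estimate otherwise. A real model would be a network.
    """
    base = 0.1 * x_t
    return base if style is None else base + style


def style_fused_eps(eps_model, x_t, t, styles, guidance_weight=1.0):
    """Fuse one or more style conditions into a single noise estimate.

    Assumed form (not taken from the paper's equations):
        eps = eps(x_t, None) + w * sum_i [eps(x_t, s_i) - eps(x_t, None)]
    With an empty style list this falls back to the unconditional estimate.
    """
    eps_uncond = eps_model(x_t, t, None)
    fused = sum(eps_model(x_t, t, s) - eps_uncond for s in styles)
    return eps_uncond + guidance_weight * fused


# Usage: guide one reverse step with a random pair of style vectors.
rng = np.random.default_rng(0)
x_t = rng.standard_normal((3, 8))          # noisy sample at step t
style_bank = rng.standard_normal((5, 3, 8))  # five hypothetical style features
picked = rng.choice(len(style_bank), size=2, replace=False)
eps_hat = style_fused_eps(toy_eps_model, x_t, t=10,
                          styles=[style_bank[i] for i in picked],
                          guidance_weight=2.0)
```

With a single style and a weight of 1, the fused estimate reduces to the ordinary conditional prediction, so the multi-style case is a strict generalization of standard condition-guided sampling under this assumed form.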

Paper Structure (32 sections, 14 equations, 11 figures, 4 tables, 2 algorithms)

Figures (11)

  • Figure 1: t-SNE visualization of time-series features extracted by various methods across three domains in HAR. Existing representation learning methods leave domain gaps, as in both (a) and (b), and cover only a small portion of the target domain (red circles). Standard data augmentation (DA) produces augmented data (stars) that stay close to the source domains (orange/blue circles) and fail to fill the gaps. Our method (d) creates a comprehensive feature space by padding domain gaps via the idea of (e).
  • Figure 2: Illustration of the diffusion within DI2SDiff. It contains a style conditioner to produce styles and a conditional diffusion for data generation. Suppose we have three original walking samples: $\mathbf{X}_1$, $\mathbf{X}_2$, and $\mathbf{X}_3$, where $\mathbf{X}_1$ is from a different domain while $\mathbf{X}_2$ and $\mathbf{X}_3$ come from the same domain. (a) The style conditioner generates style features from the original data. The style features are randomly combined to build the condition space, in which the combination of inter-domain styles is indicated by blue brackets and the combination of intra-domain styles is indicated by grey brackets. (b) During training, the diffusion retrieves each data sample with one style for the forward process. (c) During sampling, the diffusion receives noise and a style combination, e.g., $[S_1, S_3]$, for the reverse process. (d) The generated sample $\tilde{\mathbf{X}}_i$ is used to diversify the data space.
  • Figure 4: t-SNE visualization of the DSADS, PAMAP2, and USC-HAD datasets. Each method generates the same amount of synthetic data. Each domain category is represented by a color, and the target domain is represented by red dots. The original and synthetic data are represented by dots and crosses, respectively. Best viewed in color and zoomed in.
  • Figure 6: Enhancing the performance of DANN [ganin2016domain], mDSDI [bui2021exploiting], and DDLearn [qin2023generalizable] with our data generation (+) on 20% and 100% training data in three datasets.
  • Figure : (a) DSADS
  • ...and 6 more figures
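
The condition space described in the Figure 2 caption can be illustrated with a small enumeration. This is a sketch under stated assumptions, not the paper's procedure: it assumes all styles belong to the same activity class (walking in the caption), that a combination counts as inter-domain when its styles come from more than one domain, and that the helper name `build_condition_space` and the domain labels "A" and "B" are hypothetical.

```python
from itertools import combinations


def build_condition_space(style_domains, max_styles=2):
    """Enumerate style combinations and tag each one.

    style_domains maps a style name to the domain (person) it came from.
    Tags: "single" for one style, "intra" when all styles share a domain,
    "inter" when the styles span more than one domain.
    """
    space = []
    names = sorted(style_domains)
    for k in range(1, max_styles + 1):
        for combo in combinations(names, k):
            domains = {style_domains[s] for s in combo}
            if k == 1:
                kind = "single"
            elif len(domains) == 1:
                kind = "intra"
            else:
                kind = "inter"
            space.append((combo, kind))
    return space


# Usage, mirroring the caption: S1 comes from one domain, S2 and S3 from another.
walking_styles = {"S1": "A", "S2": "B", "S3": "B"}
condition_space = build_condition_space(walking_styles)
```

Sampling then amounts to drawing one entry from `condition_space` at random and passing its styles to the conditional diffusion model, so the number of usable conditions grows combinatorially with the number of source styles.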