GAITGen: Disentangled Motion-Pathology Impaired Gait Generative Model -- Bringing Motion Generation to the Clinical Domain

Vida Adeli, Soroush Mehraban, Majid Mirmehdi, Alan Whone, Benjamin Filtjens, Amirhossein Dadashzadeh, Alfonso Fasano, Andrea Iaboni, Babak Taati

TL;DR

GAITGen addresses data scarcity and bias in clinical gait analysis by disentangling motion dynamics from pathology factors and conditioning gait generation on pathology severity. It combines a conditional RVQ-VAE with two encoders ($\mathcal{E}_m$, $\mathcal{E}_p$) and two codebooks, plus a Mask Transformer and a Residual Transformer, to produce diverse, pathology-aware gait sequences; the model can mix motion from mild cases with a severe pathology latent via an interference weight $\alpha$. The authors introduce the PD-GaM dataset, accompanied by clinically tailored evaluation metrics, and demonstrate that GAITGen-generated data improves downstream UPDRS-gait severity estimation and generalizes across datasets. A clinical user study confirms the realism and clinical relevance of the synthetic sequences, supporting potential use in data augmentation for PD gait analysis and broader clinical workflows. Future directions include extending to tremor and dyskinesia, incorporating textual supervision, and exploring personalization with separate subject-level models.

Abstract

Gait analysis is crucial for the diagnosis and monitoring of movement disorders like Parkinson's Disease. While computer vision models have shown potential for objectively evaluating parkinsonian gait, their effectiveness is limited by scarce clinical datasets and the challenge of collecting large and well-labelled data, which reduces model accuracy and increases the risk of bias. To address these gaps, we propose GAITGen, a novel framework that generates realistic gait sequences conditioned on specified pathology severity levels. GAITGen employs a Conditional Residual Vector Quantized Variational Autoencoder to learn disentangled representations of motion dynamics and pathology-specific factors, coupled with Mask and Residual Transformers for conditioned sequence generation. GAITGen generates realistic, diverse gait sequences across severity levels, enriching datasets and enabling large-scale model training in parkinsonian gait analysis. Experiments on our new PD-GaM (real) dataset demonstrate that GAITGen outperforms adapted state-of-the-art models in both reconstruction fidelity and generation quality, accurately capturing critical pathology-specific gait features. A clinical user study confirms the realism and clinical relevance of our generated sequences. Moreover, incorporating GAITGen-generated data into downstream tasks improves parkinsonian gait severity estimation, highlighting its potential for advancing clinical gait analysis.
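
The residual vector quantization named in the abstract can be illustrated with a small NumPy sketch. This is generic residual quantization, not the authors' implementation: the function name `residual_quantize`, the latent shape, the codebook sizes, and the random codebooks are all assumptions made for illustration, and the conditioning and learned encoders of GAITGen are left out. Each layer quantizes whatever the previous layers failed to capture, so a sequence is represented by one token per time step per layer.

```python
import numpy as np

def residual_quantize(x, codebooks):
    """Quantize x (shape [T, D]) with a stack of codebooks, each of shape [K, D].

    Returns the summed quantized vectors and the token indices per layer.
    Hypothetical helper for illustration only.
    """
    residual = x.copy()
    quantized = np.zeros_like(x)
    indices = []
    for codebook in codebooks:
        # Squared Euclidean distance from every residual to every code: [T, K].
        dists = ((residual[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=-1)
        idx = dists.argmin(axis=1)
        chosen = codebook[idx]
        quantized += chosen   # accumulate this layer's contribution
        residual -= chosen    # the next layer sees only what is still unexplained
        indices.append(idx)
    return quantized, np.stack(indices)

rng = np.random.default_rng(0)
latents = rng.normal(size=(8, 4))  # 8 time steps, 4-dim latent (assumed sizes)
# Three random codebooks with shrinking scale, standing in for learned ones.
codebooks = [rng.normal(size=(16, 4)) * 0.5 ** layer for layer in range(3)]
quantized, tokens = residual_quantize(latents, codebooks)
```

Here `tokens` has one row per quantization layer. In a two-stage generator such as the one described above, the first row would be the base tokens and the later rows the residual tokens.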

Paper Structure

This paper contains 38 sections, 15 equations, 17 figures, 10 tables.

Figures (17)

  • Figure 1: GAITGen disentangles motion and pathology into distinct latents, allowing the same motion dynamics to present differently across pathology latents. This disentanglement enables controlled generation of gait sequences given a pathology level.
  • Figure 2: GAITGen architecture. 1) Disentangled Residual VQ-VAE encodes gait sequences into separate motion ($\mathcal{E}_m$) and pathology ($\mathcal{E}_p$) latents, enforced by distinct quantizers and classifiers for disentanglement. 2) Generation - Train: In the first stage of training, masked motion and pathology tokens are generated using the Mask Transformer ($\mathcal{M}$) with separate heads for each latent. In the second stage, the Residual Transformer ($\mathcal{R}$) predicts residuals with a next layer prediction task. 3) During inference, $\mathcal{M}$ predicts masked tokens iteratively, followed by the $\mathcal{R}$, which incrementally refines the representations across multiple quantization layers, conditioned on the pathology level.
  • Figure 3: Mix and Match augmentation. Motion latent codes ($q_m$) from one sample are combined with pathology latent codes ($q_p$) from another to synthesize new gait sequences.
  • Figure 4: Comparison of arm-swing variability across MDS-UPDRS scores for GT and synthetic sequences.
  • Figure 5: User study comparing experts' (n=6) scoring of gait sequences with the true labels. True scores indicate: (Left) synthetic subset: the condition given to GAITGen for generation; (Right) real subset: the UPDRS-gait score provided by the PD-GaM dataset.
  • ...and 12 more figures
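
The Mix and Match augmentation of Figure 3 can be sketched as pairing the motion codes of one sample with the pathology codes of another. The sketch below is a rough illustration under stated assumptions, not the paper's method: the function name `mix_and_match`, the per-time-step latent shapes, the concatenation of the two latents, and the reading of $\alpha$ as a linear blend weight between the two pathology codes are all hypothetical, since the summary does not specify how the latents are combined.

```python
import numpy as np

def mix_and_match(q_m_a, q_p_a, q_p_b, alpha=1.0):
    """Pair sample A's motion codes with a blend of A's and B's pathology codes.

    alpha=1.0 is a full swap (A's motion, B's pathology); alpha=0.0 keeps
    A's own pairing. The linear blend is an assumption of this sketch.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    q_p_mix = (1.0 - alpha) * q_p_a + alpha * q_p_b
    # Concatenate along the feature axis so that a decoder (not shown) would
    # receive both latents at every time step. Concatenation is assumed here.
    return np.concatenate([q_m_a, q_p_mix], axis=-1)

rng = np.random.default_rng(0)
T, D = 6, 4  # assumed number of time steps and latent width
q_m_mild = rng.normal(size=(T, D))    # motion codes of a mild-severity sample
q_p_mild = rng.normal(size=(T, D))    # its own pathology codes
q_p_severe = rng.normal(size=(T, D))  # pathology codes of a severe sample

swapped = mix_and_match(q_m_mild, q_p_mild, q_p_severe, alpha=1.0)
unchanged = mix_and_match(q_m_mild, q_p_mild, q_p_severe, alpha=0.0)
```

With `alpha=1.0` the motion half of `swapped` is untouched and the pathology half comes entirely from the severe sample, which matches the figure's description of synthesizing a new sequence from two sources.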