SHADE-AD: An LLM-Based Framework for Synthesizing Activity Data of Alzheimer's Patients
Heming Fu, Hongkai Chen, Shan Lin, Guoliang Xing
TL;DR
The paper introduces SHADE-AD, an LLM-based framework that synthesizes Alzheimer’s Disease (AD)-specific activity data by embedding AD knowledge into a three-stage training pipeline, addressing the scarcity of AD behavioral datasets. It combines a text encoder, a video encoder, and a diffusion-based generator to produce skeleton-based activity videos conditioned on textual prompts, using stage-wise domain adaptation and a motion-metric loss to align synthetic motion with real patient motion across 12 joints. Motion metrics validate the realism of the synthesized data, and preliminary HAR experiments show substantial gains: up to 79.69% improvement on targeted actions and successful cross-modality transfer to depth videos. The framework offers privacy-preserving, cost-effective data generation that can strengthen smart health applications for AD monitoring, with potential for broader use in disease-specific HAR tasks.
Abstract
Alzheimer's Disease (AD) has become an increasingly critical global health concern, necessitating effective monitoring solutions in smart health applications. However, the development of such solutions is significantly hindered by the scarcity of AD-specific activity datasets. To address this challenge, we propose SHADE-AD, a Large Language Model (LLM)-based framework for Synthesizing Human Activity Datasets Embedded with AD features. Leveraging both public datasets and our own data collected from 99 AD patients, SHADE-AD synthesizes human activity videos that specifically represent AD-related behaviors. By employing a three-stage training mechanism, it broadens the range of activities beyond those collected in limited deployment settings. We conducted comprehensive evaluations of the generated dataset, demonstrating significant improvements in downstream tasks such as Human Activity Recognition (HAR), with gains of up to 79.69%. Detailed motion metrics show strong alignment between real and synthetic data, validating the realism and utility of the synthesized dataset. These results underscore SHADE-AD's potential to advance smart health applications by providing a cost-effective, privacy-preserving solution for AD monitoring.
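
To make the idea of comparing motion metrics between real and synthetic skeleton data concrete, the following is a minimal sketch, not the authors' implementation. It assumes one simple metric, mean per-joint speed, and a 3D skeleton represented as a list of (x, y, z) tuples per frame; the abstract does not specify which metrics are used. The function names `mean_joint_speed` and `mean_abs_gap`, the `fps` parameter, and the toy data are all hypothetical.

```python
import math


def mean_joint_speed(frames, fps=30.0):
    """Average speed of each joint over a skeleton sequence.

    frames: list of frames; each frame is a list of (x, y, z) joint positions.
    Returns one mean speed (position units per second) per joint.
    Assumption: mean speed stands in for whatever metrics the paper uses.
    """
    if len(frames) < 2:
        raise ValueError("need at least two frames")
    n_joints = len(frames[0])
    totals = [0.0] * n_joints
    # Sum the frame-to-frame displacement of every joint, scaled to per-second.
    for prev, curr in zip(frames, frames[1:]):
        for j in range(n_joints):
            totals[j] += math.dist(prev[j], curr[j]) * fps
    steps = len(frames) - 1
    return [total / steps for total in totals]


def mean_abs_gap(real_metric, synthetic_metric):
    """Mean absolute difference between two per-joint metric vectors."""
    pairs = list(zip(real_metric, synthetic_metric))
    return sum(abs(r - s) for r, s in pairs) / len(pairs)


# Toy 12-joint sequences (hypothetical data): every joint drifts along x.
real = [[(0.10 * t, 0.0, 0.0)] * 12 for t in range(5)]
synthetic = [[(0.12 * t, 0.0, 0.0)] * 12 for t in range(5)]

gap = mean_abs_gap(mean_joint_speed(real), mean_joint_speed(synthetic))
print(f"mean per-joint speed gap: {gap:.3f}")
```

A smaller gap indicates that the synthetic sequence moves more like the real one under this particular metric. A quantity of this kind could in principle also serve as an auxiliary training penalty, which is the general role the summary attributes to the motion-metric loss.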
