An Integrated Approach to AI-Generated Content in e-health
Tasnim Ahmed, Salimur Choudhury
TL;DR
This work tackles data scarcity in e-health by generating synthetic images and text conditioned on disease classes. It proposes a class-conditioned diffusion model for images (ContextUnet) and an uncensored Llama-3.1-8B–based pipeline for text, both evaluated under a fourfold data configuration (Real, Composite, Synthetic, SMOTE-like). Results show that diffusion-based images outperform those from GAN-based approaches, and that text from the uncensored model captures real-world data patterns better, which boosts downstream classifier performance while also highlighting ethical risks. The framework offers a practical path to augmenting scarce clinical data, but it requires safeguards and qualitative clinician validation to ensure safe and responsible deployment.
Abstract
Artificial Intelligence-Generated Content, a subset of Generative Artificial Intelligence, holds significant potential for advancing the e-health sector by generating diverse forms of data. In this paper, we propose an end-to-end class-conditioned framework that addresses the challenge of data scarcity in health applications by generating synthetic medical images and text data, and we evaluate it on practical applications such as retinopathy detection, skin infections, and mental health assessments. Our framework integrates diffusion models and Large Language Models (LLMs) to generate data that closely match real-world patterns, which is essential for improving downstream task performance and model robustness in e-health applications. Experimental results demonstrate that the synthetic images produced by the proposed diffusion model outperform those produced by traditional GAN architectures. Similarly, in the text modality, text generated by an uncensored LLM replicates the authentic tone of real-world data significantly better than text generated by censored models.
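To make the fourfold data configuration named in the TL;DR (Real, Composite, Synthetic, SMOTE-like) more concrete, the following is a minimal sketch of how such training sets might be assembled. The definitions used here are assumptions, not taken from the paper: "Composite" is read as real plus synthetic samples, and "SMOTE-like" as real samples plus linear interpolations between same-class real samples. The function names `smote_like` and `build_configurations` and the toy feature vectors are hypothetical.

```python
import random


def smote_like(samples, n_new, rng):
    """Create n_new points by interpolating between random same-class pairs.

    Assumption: this mimics the core idea of SMOTE without nearest-neighbour
    selection; the paper's actual SMOTE-like procedure may differ.
    """
    if len(samples) < 2:
        return []  # interpolation needs at least two points
    new_points = []
    for _ in range(n_new):
        a, b = rng.sample(samples, 2)
        t = rng.random()  # interpolation weight in [0, 1)
        new_points.append([x + t * (y - x) for x, y in zip(a, b)])
    return new_points


def build_configurations(real, synthetic, seed=0):
    """Build four training sets from per-class real and synthetic samples.

    real, synthetic: dict mapping class label -> list of feature vectors.
    """
    rng = random.Random(seed)
    labels = sorted(real)
    return {
        # Real data only.
        "Real": {c: list(real[c]) for c in labels},
        # Generated data only.
        "Synthetic": {c: list(synthetic.get(c, [])) for c in labels},
        # Assumed meaning: real and generated data pooled per class.
        "Composite": {c: list(real[c]) + list(synthetic.get(c, [])) for c in labels},
        # Assumed meaning: real data plus as many interpolated points
        # as the generator contributed for that class.
        "SMOTE-like": {
            c: list(real[c]) + smote_like(real[c], len(synthetic.get(c, [])), rng)
            for c in labels
        },
    }


# Toy example with two classes and 2-D feature vectors (illustrative values only).
real = {0: [[0.0, 0.0], [1.0, 1.0]], 1: [[2.0, 2.0], [4.0, 4.0]]}
synthetic = {0: [[0.5, 0.4]], 1: [[3.0, 3.1], [2.5, 2.4]]}
configs = build_configurations(real, synthetic)
for name, per_class in configs.items():
    print(name, {c: len(v) for c, v in per_class.items()})
```

Matching the number of interpolated points to the number of generated samples keeps the Composite and SMOTE-like sets the same size, so a downstream classifier comparison would not be confounded by training-set size. This is a design choice of the sketch, not something the source states.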
