An Integrated Approach to AI-Generated Content in e-health

Tasnim Ahmed, Salimur Choudhury

TL;DR

This work tackles data scarcity in e-health by generating synthetic images and text conditioned on disease classes. It proposes a class-conditioned diffusion model for images (ContextUnet) and an uncensored Llama-3.1-8B–based pipeline for text, evaluated through a fourfold data configuration (Real, Composite, Synthetic, SMOTE-like). Results show diffusion-based images outperform GAN-based approaches, and uncensored text better captures real-world data patterns, boosting downstream classifier performance while highlighting ethical risks. The framework offers a practical path to augmenting scarce clinical data, but necessitates safeguards and qualitative clinician validation to ensure safe and responsible deployment.

Abstract

Artificial Intelligence-Generated Content, a subset of Generative Artificial Intelligence, holds significant potential for advancing the e-health sector by generating diverse forms of data. In this paper, we propose an end-to-end class-conditioned framework that addresses the challenge of data scarcity in health applications by generating synthetic medical images and text data, and we evaluate it on practical applications such as retinopathy detection, skin infections, and mental health assessments. Our framework integrates diffusion models and Large Language Models (LLMs) to generate data that closely match real-world patterns, which is essential for improving downstream task performance and model robustness in e-health applications. Experimental results demonstrate that the synthetic images produced by the proposed diffusion model outperform those of traditional GAN architectures. Similarly, in the text modality, data generated by an uncensored LLM aligns significantly better with real-world data than data from censored models in replicating the authentic tone.

Paper Structure

This paper contains 17 sections, 4 figures, 3 tables.

Figures (4)

  • Figure 1: Proposed framework for training and evaluation of AI-generated e-health data. This framework generates both image and text data that approximate the original samples, although the text modality does not require task-specific training. The generated data can be applied in diverse e-health contexts, including diagnosis, monitoring, personalized treatment, and remote care.
  • Figure 2: Class-conditioned diffusion model-generated synthetic diabetic retinopathy images (grayscale). The columns from left to right correspond to class labels: No_DR, Mild, Moderate, Severe, and Proliferate_DR. The first two rows display synthetic images generated by the model, while the last two rows contain real images for comparison.
  • Figure 3: Class-conditioned diffusion model-generated synthetic images (grayscale) related to 21 dermatological conditions. The columns from left to right correspond to class labels. The first two rows display synthetic images generated by the model, while the last two rows contain real images for comparison.
  • Figure 4: Training metrics over steps for the diabetic retinopathy (left) and DermNet (right) datasets. The normalized scores of FID, SSIM, and PSNR are shown. Each step corresponds to 50 epochs.
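
The Figure 4 caption refers to normalized FID, SSIM, and PSNR scores. As a rough illustration of one of these metrics, the sketch below computes PSNR between two grayscale images and rescales a series of per-step scores to [0, 1]. The paper does not state how its scores were normalized, so the min-max scaling here is an assumption, and the function names `psnr` and `min_max_normalize` are hypothetical rather than taken from the authors' code.

```python
import math


def psnr(reference, generated, max_value=255.0):
    """Peak signal-to-noise ratio (in dB) between two grayscale images.

    Images are nested lists of pixel intensities with identical shapes.
    """
    ref = [p for row in reference for p in row]
    gen = [p for row in generated for p in row]
    if len(ref) != len(gen):
        raise ValueError("images must have the same number of pixels")
    # Mean squared error over all pixels.
    mse = sum((a - b) ** 2 for a, b in zip(ref, gen)) / len(ref)
    if mse == 0:
        return math.inf  # identical images
    return 10 * math.log10(max_value ** 2 / mse)


def min_max_normalize(values):
    """Rescale scores to [0, 1]. Assumed scheme, not confirmed by the paper."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


if __name__ == "__main__":
    real = [[10, 20], [30, 40]]
    fake = [[12, 18], [33, 41]]
    print(f"PSNR: {psnr(real, fake):.2f} dB")
    # Hypothetical per-step PSNR values, only to show the rescaling.
    print(min_max_normalize([10.0, 20.0, 30.0]))
```

Higher PSNR and SSIM indicate closer agreement with real images, while lower FID is better, so a plot that overlays all three on one normalized axis would need to account for that difference in direction.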