Quantum Generative Models for Image Generation: Insights from MNIST and MedMNIST

Chi-Sheng Chen, Wei An Hou, Hsiang-Wei Hu, Zhen-Sheng Cai

TL;DR

The paper investigates whether integrating a variational quantum circuit into a diffusion-based image generator can outperform an otherwise identical classical architecture. It introduces a hybrid quantum-classical diffusion model with a parameterized quantum circuit (PQC) inserted in the bottleneck of a lightweight U-Net, implemented as a PennyLane-based differentiable QNode and evaluated on MNIST and MedMNIST (PathMNIST). On MNIST, in low-data grayscale settings, the quantum model achieves higher perceptual quality (higher SSIM) and a closer match to the data distribution (lower FID) than the classical baseline. On PathMNIST, a color dataset with higher variance, the results are mixed: the quantum model achieves better FID but worse SSIM. These results indicate that quantum diffusion is feasible and promising for low-resource and biomedical image generation, while highlighting current limitations and pointing to more expressive quantum architectures and hardware validation as future work.

Abstract

Quantum generative models offer a promising new direction in machine learning by leveraging quantum circuits to enhance data generation capabilities. In this study, we propose a hybrid quantum-classical image generation framework that integrates variational quantum circuits into a diffusion-based model. To improve training dynamics and generation quality, we introduce two novel noise strategies: intrinsic quantum-generated noise and a tailored noise scheduling mechanism. Our method is built upon a lightweight U-Net architecture, with the quantum layer embedded in the bottleneck module to isolate its effect. We evaluate our model on MNIST and MedMNIST datasets to examine its feasibility and performance. Notably, our results reveal that under limited data conditions (fewer than 100 training images), the quantum-enhanced model generates images with higher perceptual quality and distributional similarity than its classical counterpart using the same architecture. While the quantum model shows advantages on grayscale data such as MNIST, its performance is more nuanced on complex, color-rich datasets like PathMNIST. These findings highlight both the potential and current limitations of quantum generative models and lay the groundwork for future developments in low-resource and biomedical image generation.
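To make the idea of a "quantum layer" concrete, the following is a minimal sketch of what a variational quantum circuit computes: classical features are encoded as rotation angles, passed through entangling gates and trainable rotations, and read out as Pauli-Z expectation values. It is a plain-Python statevector simulation, not the PennyLane QNode used in the paper. The gate set (RY rotations and a CNOT chain), the qubit ordering, and the names `apply_ry`, `apply_cnot`, `expval_z`, and `vqc_layer` are assumptions chosen for illustration; the paper's actual circuit is the one shown in its Figure 2.

```python
import math


def apply_ry(state, qubit, theta, n_qubits):
    """Apply RY(theta) to one qubit of a real-valued statevector."""
    c, s = math.cos(theta / 2), math.sin(theta / 2)
    bit = 1 << (n_qubits - 1 - qubit)  # assumption: qubit 0 is the most significant bit
    new = list(state)
    for i in range(len(state)):
        if i & bit == 0:
            a0, a1 = state[i], state[i | bit]
            new[i] = c * a0 - s * a1
            new[i | bit] = s * a0 + c * a1
    return new


def apply_cnot(state, control, target, n_qubits):
    """Flip the target qubit on every basis state where the control qubit is 1."""
    cbit = 1 << (n_qubits - 1 - control)
    tbit = 1 << (n_qubits - 1 - target)
    new = list(state)
    for i in range(len(state)):
        if i & cbit and not i & tbit:
            new[i], new[i | tbit] = state[i | tbit], state[i]
    return new


def expval_z(state, qubit, n_qubits):
    """Expectation value of Pauli-Z on one qubit (a value in [-1, 1])."""
    bit = 1 << (n_qubits - 1 - qubit)
    return sum((-1 if i & bit else 1) * a * a for i, a in enumerate(state))


def vqc_layer(features, weights):
    """Hypothetical VQC: angle embedding, then (CNOT chain + trainable RY) per layer.

    features: one rotation angle per qubit (the classical input).
    weights:  list of layers, each a list with one trainable angle per qubit.
    """
    n = len(features)
    state = [0.0] * (2 ** n)
    state[0] = 1.0  # start in |0...0>
    for q, x in enumerate(features):  # encode inputs as RY angles
        state = apply_ry(state, q, x, n)
    for layer in weights:
        for q in range(n - 1):  # entangle neighbouring qubits
            state = apply_cnot(state, q, q + 1, n)
        for q, w in enumerate(layer):  # trainable rotations
            state = apply_ry(state, q, w, n)
    return [expval_z(state, q, n) for q in range(n)]


if __name__ == "__main__":
    # Two bottleneck features in, two expectation values out.
    print(vqc_layer([0.3, 1.2], [[0.1, -0.4], [0.7, 0.2]]))
```

In a hybrid model of the kind the abstract describes, a layer like this would sit between the U-Net encoder and decoder: a small vector of bottleneck features goes in, and an equally small vector of expectation values comes out. The simulation cost grows as 2^n in the number of qubits, which is one reason such layers are kept narrow.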

Paper Structure

This paper contains 14 sections, 13 equations, 7 figures, 2 tables.

Figures (7)

  • Figure 1: Model Architecture of Quantum Diffusion Model in this work.
  • Figure 2: The VQC (quantum layer) used in this work.
  • Figure 3: Comparison between classical and quantum diffusion models on MNIST digit 0.
  • Figure 4: Comparison between classical and quantum diffusion models on MNIST digit 1.
  • Figure 5: Comparison between classical and quantum diffusion models on MNIST digit 6.
  • ...and 2 more figures