MediQ-GAN: Quantum-Inspired GAN for High Resolution Medical Image Generation

Qingyue Jiao; Yongcan Tang; Jun Zhuang; Jason Cong; Yiyu Shi

TL;DR

MediQ-GAN is a quantum-inspired GAN with prototype-guided skip connections and a dual-stream generator that fuses classical and quantum-inspired branches. Across three medical imaging datasets, it outperforms state-of-the-art GANs and diffusion models.

Abstract

Machine learning-assisted diagnosis shows promise, yet medical imaging datasets are often scarce, imbalanced, and constrained by privacy, making data augmentation essential. Classical generative models typically demand extensive computational and sample resources. Quantum computing offers a promising alternative, but existing quantum-based image generation methods remain limited in scale and often face barren plateaus. We present MediQ-GAN, a quantum-inspired GAN with prototype-guided skip connections and a dual-stream generator that fuses classical and quantum-inspired branches. Its variational quantum circuits inherently preserve full-rank mappings, avoid rank collapse, and are theory-guided to balance expressivity with trainability. Beyond generation quality, we provide the first latent-geometry and rank-based analysis of quantum-inspired GANs, offering theoretical insight into their performance. Across three medical imaging datasets, MediQ-GAN outperforms state-of-the-art GANs and diffusion models. While validated on IBM hardware for robustness, our contribution is hardware-agnostic, offering a scalable and data-efficient framework for medical image generation and augmentation.

Paper Structure

This paper contains 26 sections, 10 equations, 7 figures, 7 tables.

Figures (7)

Figure 1: MediQ-GAN architecture. (a) Full GAN pipeline. In MediQ-GAN, the latent noise is sampled from Gaussian noise and passed to a simple classical neural network. Half of the classical intermediate features are encoded into quantum features and processed by variational quantum circuits. The output from the quantum circuits is concatenated with the rest of the classical intermediate features, followed by a convolutional layer to produce the color image. The critic is a classical deep convolutional network and acts as a quality inspector, labeling input images as real or fake. (b) Generator architecture. The input $z$ (latent noise plus prototypes) is encoded into a feature map $F\in\mathbb{R}^{32\times4\times4}$, split into classical and quantum streams. Quantum features are processed by five variational quantum circuits (16 qubits each), then fused with classical features and decoded into a $3\times64\times64$ RGB image.
Figure 1: PQC Trainability versus DLA Dimension Scaling. This figure compares the scaling behavior of the DLA dimension for different PQC architectures. (Dashed line) Globally-entangled PQC: The dimension grows exponentially ($\sim 4^n$), leading to the barren plateau problem. (Solid line) Linear-chain PQC (This work): The dimension grows polynomially ($O(n^3)$), effectively avoiding barren plateaus and ensuring trainability.
Figure 2: Random synthetic samples from representative classes of ISIC 2019, ODIR-5k, and RetinaMNIST by baseline generative models and our model, MediQ-GAN.
Figure 2: Light-cone growth and optimal depth in a 1D nearest-neighbor PQC. This figure illustrates the causal light-cone expansion in a PQC with $n=16$ qubits and $E=1$. To achieve full causal connectivity, information from the first qubit ($q_0$) must be able to reach the last ($q_{15}$), requiring a minimum depth of $L_{\mathrm{opt}} = 8$.
Figure 3: Variational quantum circuit ansatz design for quantum generators. Each input component $x_j$ is encoded with angle encoding to qubit $j$ using $R_x(x_j)$ followed by $R_y(x_j)$. The encoded state is processed by $L$ identical trainable layers ($L=8$ for the optimal result), each applying a learnable single-qubit rotation $R_y(\theta_{\ell,j})$ and a nearest-neighbour CNOT chain ($1\!\to\!2\!\to\!\cdots\!\to\!N$). Finally, all qubits are read out in the $X$ basis, producing the classical vector $\mathbf{m}(x;\Theta)=(\langle X_1\rangle,\ldots,\langle X_N\rangle)\in[-1,1]^N$. The “$\times 8$” indicates that the layer is repeated eight times, and the wavy gap indicates middle wires with the same gates.
...and 2 more figures
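
The ansatz described in the Figure 3 caption can be illustrated with a small statevector simulation. This is a minimal sketch, not the authors' implementation. It assumes that each layer applies all $R_y(\theta_{\ell,j})$ rotations first and then the CNOT chain in order $1\to2\to\cdots\to N$, that the register starts in $|0\ldots0\rangle$, and that the standard gate conventions $R_x(a)=e^{-iaX/2}$ and $R_y(a)=e^{-iaY/2}$ apply. The function names (`apply_1q`, `apply_cnot`, `measure_ansatz`) are hypothetical, and the demo uses 4 qubits instead of 16 to keep it light.

```python
import numpy as np


def rx(a):
    """Single-qubit rotation about X: exp(-i a X / 2)."""
    c, s = np.cos(a / 2), np.sin(a / 2)
    return np.array([[c, -1j * s], [-1j * s, c]])


def ry(a):
    """Single-qubit rotation about Y: exp(-i a Y / 2)."""
    c, s = np.cos(a / 2), np.sin(a / 2)
    return np.array([[c, -s], [s, c]], dtype=complex)


def apply_1q(state, gate, q):
    """Apply a 2x2 gate to qubit q of a state stored with shape (2,)*n."""
    state = np.tensordot(gate, state, axes=([1], [q]))
    return np.moveaxis(state, 0, q)  # put the acted-on axis back in place


def apply_cnot(state, control, target):
    """CNOT: flip the target axis on the slice where the control is 1."""
    state = state.copy()
    idx = [slice(None)] * state.ndim
    idx[control] = 1
    sub = state[tuple(idx)]
    # The control axis is removed from the slice, so later axes shift by one.
    t_axis = target - 1 if target > control else target
    state[tuple(idx)] = np.flip(sub, axis=t_axis).copy()
    return state


def measure_ansatz(x, theta):
    """Return (<X_1>, ..., <X_N>) for inputs x (N,) and parameters theta (L, N)."""
    n, num_layers = len(x), theta.shape[0]
    state = np.zeros((2,) * n, dtype=complex)
    state[(0,) * n] = 1.0  # assumed initial state |0...0>

    # Angle encoding: R_x(x_j) followed by R_y(x_j) on qubit j.
    for j in range(n):
        state = apply_1q(state, rx(x[j]), j)
        state = apply_1q(state, ry(x[j]), j)

    # L trainable layers: R_y rotations, then a nearest-neighbour CNOT chain
    # (the ordering within a layer is an assumption).
    for layer in range(num_layers):
        for j in range(n):
            state = apply_1q(state, ry(theta[layer, j]), j)
        for j in range(n - 1):
            state = apply_cnot(state, j, j + 1)

    # <X_j> = <psi| X_j |psi>; X_j acts as a bit flip along axis j.
    return np.array([
        np.real(np.sum(np.conj(state) * np.flip(state, axis=j)))
        for j in range(n)
    ])


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.uniform(-1.0, 1.0, size=4)              # toy input features
    theta = rng.uniform(-np.pi, np.pi, size=(8, 4))  # L = 8 layers
    print(measure_ansatz(x, theta))
```

Each entry of the returned vector is an expectation value of a Pauli operator, so it lies in $[-1,1]$, matching the range given in the caption. Simulating the full 16-qubit circuit this way needs a statevector of $2^{16}$ amplitudes, which is still feasible on a laptop but noticeably slower.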