Latent Style-based Quantum GAN for high-quality Image Generation
Su Yeon Chang, Supanut Thanasilp, Bertrand Le Saux, Sofia Vallecorsa, Michele Grossi
TL;DR
LaSt-QGAN is a hybrid classical-quantum framework that enables large-scale image generation by training a style-based quantum generator in a latent space formed by a pre-trained autoencoder. The approach uses a Wasserstein GAN with gradient penalty to train the quantum generator to produce latent features that, when decoded, yield high-quality images on MNIST, Fashion MNIST, and SAT4 with about 10 qubits. Empirically, LaSt-QGAN achieves competitive or superior FID and JSD metrics relative to a classical GAN with a similar parameter count, and it demonstrates faster convergence and robustness to shot noise, aided by autoencoder postprocessing. The work also analyzes training dynamics, showing that small-angle initializations can mitigate barren plateaus in polynomial-depth circuits, and discusses warm-start strategies and broader implications for continuous quantum generative modeling.
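As an illustration of the Wasserstein loss with gradient penalty mentioned above, the following is a minimal sketch and not the implementation used in this work. It assumes a hypothetical linear critic D(x) = x·w + b, chosen only because its input gradient is w everywhere and the penalty can therefore be written without an autodiff library. The function names and the penalty weight `lam=10.0` (a commonly used value) are assumptions, not taken from the text.

```python
import numpy as np

def critic(x, w, b):
    # Hypothetical linear critic D(x) = x.w + b acting on latent vectors.
    return x @ w + b

def wgan_gp_critic_loss(real, fake, w, b, lam=10.0, rng=None):
    """Critic loss E[D(fake)] - E[D(real)] + lam * E[(||grad D(x_hat)|| - 1)^2]."""
    rng = np.random.default_rng(0) if rng is None else rng
    # Random points on the segments between paired real and fake samples.
    eps = rng.uniform(size=(real.shape[0], 1))
    x_hat = eps * real + (1.0 - eps) * fake
    # For a linear critic the gradient with respect to the input is w at every x_hat.
    grads = np.broadcast_to(w, x_hat.shape)
    grad_norm = np.linalg.norm(grads, axis=1)
    penalty = lam * np.mean((grad_norm - 1.0) ** 2)
    wasserstein = critic(fake, w, b).mean() - critic(real, w, b).mean()
    return wasserstein + penalty

# Toy latent batches: "real" features at 1, "fake" features at 0.
real = np.ones((4, 3))
fake = np.zeros((4, 3))
loss_unit = wgan_gp_critic_loss(real, fake, np.array([1.0, 0.0, 0.0]), 0.0)
loss_steep = wgan_gp_critic_loss(real, fake, np.array([2.0, 0.0, 0.0]), 0.0)
```

With a unit-norm gradient the penalty vanishes and only the Wasserstein term remains. Doubling the slope of the critic improves the Wasserstein term but is outweighed by the penalty, which is what keeps the critic approximately 1-Lipschitz. A nonlinear critic would need automatic differentiation to evaluate the gradient at each interpolated point.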
Abstract
Quantum generative modeling is among the promising candidates for achieving a practical advantage in data analysis. Nevertheless, one key challenge is to generate large images comparable to those generated by classical counterparts. In this work, we take an initial step in this direction and introduce the Latent Style-based Quantum GAN (LaSt-QGAN), which employs a hybrid classical-quantum approach to training Generative Adversarial Networks (GANs) for arbitrarily complex data generation. This novel approach relies on powerful classical auto-encoders to map a high-dimensional original image dataset into a latent representation. The hybrid classical-quantum GAN operates in this latent space to generate an arbitrary number of fake features, which are then passed back to the auto-encoder to be decoded into images in the original data space. LaSt-QGAN can be successfully trained with 10 qubits on realistic computer vision datasets beyond the standard MNIST, namely Fashion MNIST (fashion products) and SAT4 (Earth Observation images), achieving performance comparable to, and on some metrics better than, that of classical GANs. Moreover, we analyze the barren plateau phenomenon in the context of continuous quantum generative models using a polynomial-depth circuit and propose a method to mitigate its detrimental effect during the training of deep networks. Through empirical experiments and theoretical analysis, we demonstrate the potential of LaSt-QGAN for practical use in image generation and open the possibility of applying it to larger datasets in the future.
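The data flow described above (encode images into a latent space, generate fake latent features, decode them back into images) can be sketched as follows. This is an illustrative toy under stated assumptions, not the architecture of this work: a PCA projection stands in for the pre-trained auto-encoder, and `toy_generator` is a hypothetical placeholder for the quantum generator that applies one RY rotation per qubit to |0> (so that the Z expectation value is cos θ), with angles given by an affine map of the input noise. All names, shapes, and the synthetic data are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def fit_linear_autoencoder(images, latent_dim):
    """PCA stand-in for a pre-trained auto-encoder; returns (encode, decode)."""
    mean = images.mean(axis=0)
    # Rows of vt are orthonormal principal directions of the centered data.
    _, _, vt = np.linalg.svd(images - mean, full_matrices=False)
    basis = vt[:latent_dim]                      # (latent_dim, n_pixels)
    encode = lambda x: (x - mean) @ basis.T      # image  -> latent features
    decode = lambda z: z @ basis + mean          # latent -> image
    return encode, decode

def toy_generator(noise, weights, bias):
    """Placeholder generator: RY(theta) on |0> per qubit gives <Z> = cos(theta)."""
    theta = noise @ weights + bias               # angles depend on the noise
    return np.cos(theta)                         # one latent feature per qubit

# Synthetic stand-in for a dataset: 64 flattened 4x4 "images".
images = rng.random((64, 16))
encode, decode = fit_linear_autoencoder(images, latent_dim=4)
real_latent = encode(images)                     # features a critic would treat as real

# Generation happens entirely in the latent space, then the decoder maps back.
noise = rng.normal(size=(8, 2))
weights = rng.normal(size=(2, 4))
bias = rng.normal(size=4)
fake_latent = toy_generator(noise, weights, bias)
fake_images = decode(fake_latent)
```

In this sketch the generator only ever has to match the 4-dimensional latent distribution rather than the 16-dimensional pixel space, which is the motivation for operating in a latent space. Because expectation values lie in [-1, 1], a real pipeline would presumably also need to rescale the latent features or the generator outputs so that their ranges agree.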
