Latent Style-based Quantum GAN for high-quality Image Generation

Su Yeon Chang, Supanut Thanasilp, Bertrand Le Saux, Sofia Vallecorsa, Michele Grossi

TL;DR

LaSt-QGAN presents a hybrid classical-quantum framework that enables large-scale image generation by training a style-based quantum generator in a latent space formed by a pre-trained autoencoder. The approach leverages a Wasserstein GAN with gradient penalty to train a quantum generator to produce latent features that, when decoded, yield high-quality images on MNIST, FashionMNIST, and SAT4 with about 10 qubits. Empirically, LaSt-QGAN achieves competitive or superior FID and JSD metrics relative to a classical GAN with similar parameter counts and demonstrates faster convergence and robustness to shot noise, aided by autoencoder postprocessing. The work also analyzes training dynamics, showing that small-angle initializations can mitigate barren plateaus in polynomial-depth circuits, and discusses warm-start strategies and broader implications for continuous quantum generative modeling.
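The Wasserstein-GAN-with-gradient-penalty objective mentioned above can be sketched in a few lines of NumPy. This is an illustrative toy, not the authors' implementation: the critic is a hypothetical one-hidden-layer classical network chosen because its input gradient has a closed form, the "fake" features are random stand-ins for the quantum generator's output, and the penalty weight `lam=10.0` is a commonly used default rather than a value taken from the paper.

```python
import numpy as np

def critic(x, W, b, v):
    """Toy critic D(x) = v . tanh(W x + b), applied row-wise to a batch."""
    return np.tanh(x @ W.T + b) @ v

def critic_input_grad(x, W, b, v):
    """Analytic gradient of the toy critic with respect to its input x."""
    h = np.tanh(x @ W.T + b)            # shape (batch, hidden)
    return ((1.0 - h ** 2) * v) @ W     # shape (batch, latent_dim)

def gradient_penalty(real, fake, W, b, v, rng, lam=10.0):
    """Penalize deviations of the critic's gradient norm from 1 at points
    interpolated between real and fake samples."""
    eps = rng.uniform(size=(real.shape[0], 1))
    x_hat = eps * real + (1.0 - eps) * fake
    grad_norm = np.linalg.norm(critic_input_grad(x_hat, W, b, v), axis=1)
    return lam * np.mean((grad_norm - 1.0) ** 2)

def critic_loss(real, fake, W, b, v, rng, lam=10.0):
    """Critic objective: E[D(fake)] - E[D(real)] + gradient penalty."""
    wasserstein = critic(fake, W, b, v).mean() - critic(real, W, b, v).mean()
    return wasserstein + gradient_penalty(real, fake, W, b, v, rng, lam)

def generator_loss(fake, W, b, v):
    """Generator objective: -E[D(fake)]."""
    return -critic(fake, W, b, v).mean()

rng = np.random.default_rng(0)
latent_dim, hidden, batch = 20, 16, 8   # illustrative sizes (assumption)
W = rng.normal(scale=0.3, size=(hidden, latent_dim))
b = np.zeros(hidden)
v = rng.normal(size=hidden)
real = rng.normal(size=(batch, latent_dim))            # stand-in for encoded images
fake = rng.normal(loc=0.5, size=(batch, latent_dim))   # stand-in for generator output

print("critic loss:", critic_loss(real, fake, W, b, v, rng))
print("generator loss:", generator_loss(fake, W, b, v))
```

In an actual training loop the critic and generator parameters would be updated alternately by descending these two losses; the sketch only evaluates them once.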

Abstract

Quantum generative modeling is among the promising candidates for achieving a practical advantage in data analysis. Nevertheless, one key challenge is to generate large-size images comparable to those generated by their classical counterparts. In this work, we take an initial step in this direction and introduce the Latent Style-based Quantum GAN (LaSt-QGAN), which employs a hybrid classical-quantum approach to training Generative Adversarial Networks (GANs) for arbitrarily complex data generation. This novel approach relies on powerful classical auto-encoders to map a high-dimensional original image dataset into a latent representation. The hybrid classical-quantum GAN operates in this latent space to generate an arbitrary number of fake features, which are then passed back to the auto-encoder to reconstruct the original data. Our LaSt-QGAN can be successfully trained on realistic computer vision datasets beyond the standard MNIST, namely Fashion MNIST (fashion products) and SAT4 (Earth Observation images), with 10 qubits, achieving performance comparable to (and in some metrics better than) classical GANs. Moreover, we analyze the barren plateau phenomenon in the context of continuous quantum generative models using a polynomial-depth circuit and propose a method to mitigate its detrimental effect when training deep networks. Through empirical experiments and theoretical analysis, we demonstrate the potential of LaSt-QGAN for practical use in image generation and open the possibility of applying it to larger datasets in the future.
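The data flow described in the abstract (encode images into a latent space, generate fake latent features, decode them back into images) can be sketched as follows. Everything here is a stand-in chosen for brevity: a linear PCA projection replaces the convolutional auto-encoder, a fixed random linear map replaces the trained quantum generator, and the images are random arrays. The sizes 10 and 20 echo the values quoted in the Figure 4 caption; reading them as noise and latent dimensions is an assumption, as is the uniform noise distribution.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: 200 flattened 8x8 "images" (not a real dataset).
images = rng.normal(size=(200, 64))
mean = images.mean(axis=0)

latent_dim, noise_dim = 20, 10  # illustrative sizes (assumption)

# Stand-in auto-encoder: PCA projection onto the top latent_dim directions.
_, _, vt = np.linalg.svd(images - mean, full_matrices=False)
components = vt[:latent_dim]    # shape (latent_dim, 64), orthonormal rows

def encode(x):
    """Map images to latent features."""
    return (x - mean) @ components.T

def decode(z):
    """Map latent features back to image space."""
    return z @ components + mean

# Step 1: embed the dataset; these features would form the GAN training set.
real_features = encode(images)

# Step 2: stand-in generator (an untrained random linear map, NOT a quantum
# circuit) that turns noise vectors into latent features.
A = rng.normal(size=(noise_dim, latent_dim))

def generate_features(n):
    noise = rng.uniform(-1.0, 1.0, size=(n, noise_dim))
    return noise @ A

# Step 3: decode generated features into images with the same decoder.
fake_features = generate_features(5)
fake_images = decode(fake_features)

print(real_features.shape, fake_features.shape, fake_images.shape)
```

The point of the design is that the generator only has to model a low-dimensional feature distribution, while the pre-trained decoder handles the mapping back to full-resolution images.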

Paper Structure (23 sections, 67 equations, 22 figures, 5 tables)

Figures (22)

  • Figure 1: Schematic diagram summarizing the general training framework of discrete and continuous quantum GANs. Frequently, we use a hybrid approach with a quantum generator and classical discriminator [Zoufal2019, BravoPrieto2022], although an alternative option exists where a quantum discriminator is employed [Romero2019].
  • Figure 2: Schematic diagram for LaSt-QGAN training. The model consists of a convolutional auto-encoder that embeds the original images into a low-dimensional latent space and a quantum GAN with a quantum generator $G_{\bm{\theta}}$ and a classical discriminator $D_{\bm{\phi}}$. The features extracted with the autoencoder are used as the training set of the GAN. At the end of the training, images are reconstructed by inversely transforming the features generated by the quantum generator using the pre-trained convolutional auto-encoder.
  • Figure 3: Different circuit architectures used for the learning layers, $U^\ell_{\bm{\theta}}$, in the quantum generator. (a) Circuit1 and (b) Circuit2 are taken from two different quantum GAN papers for continuous data generation by C. Bravo Prieto et al. [BravoPrieto2022] and J. Romero et al. [Romero2019], respectively. (c) Circuit3 is composed of repeated two-qubit quantum circuits (blue square), responsible for arbitrary $SU(4)$ state generation [MacCormack2020SU4].
  • Figure 4: Examples of images generated via LaSt-QGAN (Circuit1, depth 2) and a classical GAN ([50, 30]) for different datasets: MNIST, FashionMNIST and SAT4. The fake features are obtained using $\mathcal{D}_{\mathbf{z}} = 10$ and $\mathcal{D}_{\ell} = 20$, and the images are reconstructed using a pre-trained convolutional auto-encoder from the features obtained by the GAN in the latent space. The images are presented in columns classified using a pre-trained ResNet50 [He2015resnet] for MNIST and FashionMNIST, and in rows for SAT4.
  • Figure 5: Visualization of generated features embedded into two dimensions using t-SNE for the MNIST and FashionMNIST datasets. The labels of generated samples are obtained via classification with a pre-trained ResNet50. The clustering of features reveals that the underlying similarity in each class is preserved in the latent space with the proposed models.
  • ...and 17 more figures