Quantum Down Sampling Filter for Variational Auto-encoder
Farina Riaz, Fakhar Zaman, Hajime Suzuki, Sharif Abuadbba, David Nguyen
TL;DR
This work addresses the difficulty of achieving high-fidelity reconstructions in variational autoencoders by introducing a quantum downsampling filter that places quantum encoding exclusively in the encoder (Q-VAE). The approach keeps a classical decoder architecture and compares three models on MNIST and USPS: the classical VAE (C-VAE), the classical direct-passing VAE (CDP-VAE), and Q-VAE. Q-VAE yields lower FID and MSE, indicating more faithful reconstructions. Notably, Q-VAE achieves these gains without adding trainable parameters, while CDP-VAE is more parameter-efficient than C-VAE. Together, these results highlight the potential of quantum encoding to improve latent representations and generative quality in lightweight settings. The findings suggest practical value for high-quality data synthesis and image reconstruction, though scalability to larger, higher-resolution data and to quantum hardware remains a critical area for future work.
Abstract
Variational autoencoders (VAEs) are fundamental for generative modeling and image reconstruction, yet they often struggle to produce high-fidelity reconstructions. This study introduces a hybrid model, the quantum variational autoencoder (Q-VAE), which integrates quantum encoding within the encoder while using fully connected layers to extract meaningful representations. The decoder uses transposed convolution layers for up-sampling. The Q-VAE is evaluated against the classical VAE (C-VAE) and the classical direct-passing VAE (CDP-VAE), which uses windowed pooling filters. Results on the MNIST and USPS datasets show that Q-VAE consistently outperforms both classical approaches, achieving lower Fréchet inception distance (FID) scores and thereby indicating higher image fidelity and reconstruction quality. These findings highlight the potential of Q-VAE for high-quality synthetic data generation and improved image reconstruction in generative models.
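To make the two reported metrics concrete, the following is a minimal Python sketch of how MSE and a Fréchet-style distance can be computed. It is illustrative only and is not the authors' evaluation code. The function names `mse` and `frechet_distance_diag` are hypothetical, and the Fréchet computation assumes diagonal covariances and population variances, which simplifies the standard FID (the standard FID uses full covariance matrices of Inception-network features).

```python
import math
from statistics import fmean, pvariance


def mse(original, reconstruction):
    """Mean squared error between two equal-length flat pixel sequences."""
    if len(original) != len(reconstruction):
        raise ValueError("inputs must have the same length")
    return fmean([(a - b) ** 2 for a, b in zip(original, reconstruction)])


def frechet_distance_diag(real_feats, gen_feats):
    """Squared Frechet distance between two Gaussians fitted to feature sets.

    Assumption (simplification): covariances are treated as diagonal, so the
    trace term reduces to a per-dimension sum. Each argument is a list of
    equal-length feature vectors (hypothetical stand-ins for Inception
    features).
    """
    dims = len(real_feats[0])
    total = 0.0
    for d in range(dims):
        r = [f[d] for f in real_feats]
        g = [f[d] for f in gen_feats]
        mu_r, mu_g = fmean(r), fmean(g)
        var_r, var_g = pvariance(r, mu_r), pvariance(g, mu_g)
        # ||mu_r - mu_g||^2 + Tr(S_r + S_g - 2 (S_r S_g)^(1/2)), diagonal case
        total += (mu_r - mu_g) ** 2 + var_r + var_g - 2.0 * math.sqrt(var_r * var_g)
    return total


if __name__ == "__main__":
    print(mse([0.0, 0.0], [1.0, 1.0]))
    print(frechet_distance_diag([[0.0], [2.0]], [[1.0], [3.0]]))
```

Lower values of both quantities indicate that reconstructions are closer to the originals. MSE compares images pixel by pixel, while the Fréchet distance compares the statistics of two feature distributions.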
