TabularQGAN: A Quantum Generative Model for Tabular Data
Pallavi Bhardwaj, Caitlin Jones, Lasse Dierich, Aleksandar Vučković
TL;DR
TabularQGAN addresses privacy-preserving synthesis of heterogeneous tabular data by introducing a quantum GAN whose generator is a variational quantum circuit and whose encoding natively handles numerical and categorical features without an autoencoder. It employs a one-hot encoding built from Givens rotations and a linear-scaling circuit with inter-register entanglers to capture cross-feature correlations, and it is trained adversarially against a classical discriminator with parameter-shift gradient updates. On the MIMIC-III and Adult Census datasets, TabularQGAN achieves a higher SDMetrics similarity score than the classical baselines CTGAN and CopulaGAN (by 8.5% on average) while using only 0.072% of their parameters, and it shows signs of generalization on novel-sample metrics. The work demonstrates the practical viability of quantum generative models for tabular data and motivates scaling studies toward larger qubit counts and real hardware implementations.
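As a rough illustration of the parameter-shift updates mentioned above, the sketch below differentiates a toy single-qubit expectation value. It is not the TabularQGAN generator: the one-parameter RY circuit and the function names `expectation_z` and `parameter_shift_grad` are assumptions chosen only to show how a gradient can be obtained from two shifted circuit evaluations.

```python
import math

def expectation_z(theta):
    """<Z> after applying RY(theta) to |0> (toy stand-in for a circuit output)."""
    amp0 = math.cos(theta / 2)  # amplitude of |0>
    amp1 = math.sin(theta / 2)  # amplitude of |1>
    return amp0 ** 2 - amp1 ** 2  # equals cos(theta)

def parameter_shift_grad(f, theta):
    """Gradient of f at theta from two evaluations shifted by +/- pi/2.

    Valid for gates generated by a Pauli operator (such as RY), which is
    an assumption about the circuit, not a detail taken from the paper.
    """
    shift = math.pi / 2
    return 0.5 * (f(theta + shift) - f(theta - shift))

theta = 0.7
grad = parameter_shift_grad(expectation_z, theta)
analytic = -math.sin(theta)  # d/dtheta cos(theta)
print(f"parameter-shift: {grad:.6f}, analytic: {analytic:.6f}")
```

For this toy circuit the two values agree, because the rule is exact rather than a finite-difference approximation. In a real training loop, each evaluation of `f` would be an estimate from circuit measurements.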
Abstract
In this paper, we introduce a novel quantum generative model for synthesizing tabular data. Synthetic data is valuable in scenarios where real-world data is scarce or private, since it can be used to augment or replace existing datasets. Real-world enterprise data is predominantly tabular and heterogeneous, often comprising a mixture of categorical and numerical features, which makes tabular data synthesis highly relevant across industries such as healthcare, finance, and software. We propose a quantum generative adversarial network architecture with flexible data encoding and a novel quantum circuit ansatz to effectively model tabular data. The proposed approach is tested on the MIMIC-III healthcare and Adult Census datasets, with extensive benchmarking against the leading classical models CTGAN and CopulaGAN. Experimental results demonstrate that our quantum model outperforms the classical models by an average of 8.5% with respect to an overall similarity score from SDMetrics, while using only 0.072% of the parameters of the classical models. Additionally, we evaluate the generalization capabilities of the models using two custom-designed metrics, which demonstrate the ability of the proposed quantum model to generate useful and novel samples. To our knowledge, this is one of the first demonstrations of a successful quantum generative model for tabular data, indicating that this task could be well suited to quantum computers.
