Quantum Machine Learning via Contrastive Training
Liudmila A. Zhukas, Vivian Ni Zhang, Qiang Miao, Qingfeng Wang, Marko Cetina, Jungsang Kim, Lawrence Carin, Christopher Monroe
TL;DR
Quantum machine learning is often constrained by limited labeled data, particularly as models scale. The paper demonstrates self-supervised contrastive pretraining implemented on hardware: classical images are encoded as quantum states on a trapped-ion processor, and state overlaps measured on the device serve as the similarity used to learn quantum representations from unlabeled data. The two-stage pipeline (pretraining on unlabeled data, then fine-tuning on a small labeled set) yields higher mean test accuracy and lower run-to-run variability than fully supervised training, especially in low-label regimes. The results, including robustness to initialization and generalization of the learned invariances, suggest a practical, label-efficient pathway for quantum-native datasets and scalable quantum representations.
Abstract
Quantum machine learning (QML) has attracted growing interest with the rapid parallel advances in large-scale classical machine learning and quantum technologies. Like their classical counterparts, QML models face challenges arising from the scarcity of labeled data, particularly as their scale and complexity increase. Here, we introduce self-supervised pretraining of quantum representations that reduces reliance on labeled data by learning invariances from unlabeled examples. We implement this paradigm on a programmable trapped-ion quantum computer, encoding images as quantum states. In situ contrastive pretraining on hardware yields a representation that, when fine-tuned, classifies image families with higher mean test accuracy and lower run-to-run variability than models trained from random initialization. The performance improvement is especially significant in regimes with limited labeled training data. We show that the learned invariances generalize beyond the pretraining image samples. Unlike prior work, our pipeline derives similarity from measured quantum overlaps and executes all training and classification stages on hardware. These results establish a label-efficient route to quantum representation learning, with direct relevance to quantum-native datasets and a clear path to larger classical inputs.
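As a rough illustration of how a contrastive objective can be built from state overlaps, the following classical sketch may help. It is not the authors' implementation: the amplitude-style encoding in `encode`, the InfoNCE-style form of `contrastive_loss`, and the `temperature` parameter are all assumptions made for illustration, and the overlap is computed exactly with NumPy as a stand-in for the overlaps measured on the trapped-ion hardware.

```python
import numpy as np

def encode(image):
    """Flatten an image and L2-normalize it so the entries can be read
    as state amplitudes (an assumed encoding, for illustration only)."""
    v = np.asarray(image, dtype=float).ravel()
    norm = np.linalg.norm(v)
    if norm == 0:
        raise ValueError("cannot encode an all-zero image")
    return v / norm

def overlap(a, b):
    """Squared overlap |<a|b>|^2 between two normalized state vectors."""
    return float(np.abs(np.vdot(a, b)) ** 2)

def contrastive_loss(anchor, positive, negatives, temperature=0.5):
    """InfoNCE-style loss (assumed form): small when the anchor overlaps
    strongly with the positive and weakly with the negatives."""
    pos = overlap(anchor, positive) / temperature
    negs = np.array([overlap(anchor, n) for n in negatives]) / temperature
    logits = np.concatenate(([pos], negs))
    # Numerically stable log-sum-exp over all candidates.
    m = logits.max()
    log_sum = m + np.log(np.exp(logits - m).sum())
    return float(log_sum - pos)

# Toy 2x2 "images": an anchor, a slightly perturbed view, and a different pattern.
anchor = encode([[1.0, 0.0], [0.0, 1.0]])
augmented = encode([[0.9, 0.1], [0.0, 1.0]])
other = encode([[0.0, 1.0], [1.0, 0.0]])

good = contrastive_loss(anchor, augmented, [other])
bad = contrastive_loss(anchor, other, [augmented])
print(good < bad)
```

In this toy setup the perturbed view plays the role of an augmentation of the same image, so treating it as the positive gives a lower loss than treating the unrelated pattern as the positive. A pretraining loop would adjust the parameters of the state-preparation circuit to reduce this loss over many unlabeled pairs.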
