Automating quantum feature map design via large language models

Kenya Sakka, Kosuke Mitarai, Keisuke Fujii

TL;DR

This work tackles the challenge of designing practical quantum feature maps for QSVM-based classification by introducing an agentic, LLM-driven framework that automates idea generation, validation, evaluation, and refinement of quantum feature maps. The system combines retrieval-augmented knowledge, code generation, and empirical kernel-based evaluation on MNIST-like data, and shows that dataset-adaptive feature maps can outperform several quantum baselines and approach classical kernel performance on multiple benchmarks. Key contributions are a five-component loop (Generation, Storage, Validation, Evaluation, Review) that enables iterative improvement without internal training, and a demonstration that high-performing maps can be discovered autonomously, with competitive results on MNIST, Fashion-MNIST, and CIFAR-10. The findings highlight the potential of automated quantum algorithm design to bridge theoretical promise and practical deployment. Future work includes incorporating trainable components and extending the approach to quantum algorithms beyond feature maps. (Mathematical relation: the quantum kernel is $K(x, x') = |\langle\Phi(x)|\Phi(x')\rangle|^2 = \mathrm{Tr}[\rho(x)\rho(x')]$, where $|\Phi(x)\rangle = U(x)|0\rangle$ and $\rho(x) = U(x)|0\rangle\langle 0|U(x)^\dagger$ is the corresponding density operator.)
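To make the kernel relation in the TL;DR concrete, here is a minimal sketch in NumPy. It assumes a simple angle-encoding feature map (one RY rotation per qubit), which is chosen only for illustration and is not the feature map discovered in the paper; the function names `feature_state`, `quantum_kernel`, and `gram_matrix` are hypothetical.

```python
import numpy as np

def feature_state(x):
    """Return |Phi(x)> = U(x)|0...0> for an illustrative angle encoding.

    Assumption: U(x) applies RY(x_i) to qubit i, so each qubit is
    cos(x_i/2)|0> + sin(x_i/2)|1> and the full state is their tensor product.
    """
    state = np.array([1.0])
    for angle in x:
        qubit = np.array([np.cos(angle / 2), np.sin(angle / 2)])
        state = np.kron(state, qubit)
    return state

def quantum_kernel(x1, x2):
    """K(x, x') = |<Phi(x)|Phi(x')>|^2, computed by exact statevector overlap."""
    overlap = np.vdot(feature_state(x1), feature_state(x2))
    return float(np.abs(overlap) ** 2)

def gram_matrix(X):
    """Kernel (Gram) matrix over all pairs of rows in X."""
    n = len(X)
    return np.array([[quantum_kernel(X[i], X[j]) for j in range(n)]
                     for i in range(n)])

# Toy inputs (illustrative values, not data from the paper)
X = np.array([[0.1, 0.7], [1.2, -0.4], [0.0, 0.0]])
K = gram_matrix(X)
```

A Gram matrix of this kind is what a kernel method consumes; for example, scikit-learn's `SVC(kernel="precomputed")` accepts such a matrix in `fit`. On hardware the overlap would be estimated from measurement statistics rather than computed exactly as it is here.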

Abstract

Quantum feature maps are a key component of quantum machine learning, encoding classical data into quantum states to exploit the expressive power of high-dimensional Hilbert spaces. Despite their theoretical promise, designing quantum feature maps that offer practical advantages over classical methods remains an open challenge. In this work, we propose an agentic system that autonomously generates, evaluates, and refines quantum feature maps using large language models. The system consists of five components: Generation, Storage, Validation, Evaluation, and Review. Using these components, it iteratively improves quantum feature maps. Experiments on the MNIST dataset show that it can successfully discover and refine feature maps without human intervention. The best feature map generated outperforms existing quantum baselines and achieves competitive accuracy compared to classical kernels across MNIST, Fashion-MNIST, and CIFAR-10. Our approach provides a framework for exploring dataset-adaptive quantum features and highlights the potential of LLM-driven automation in quantum algorithm design.
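The iterative loop described in the abstract could be organized roughly as in the toy sketch below. Everything in it is an assumption made for illustration: the abstract names the five components but does not specify their interfaces, so the functions, the single numeric `scale` knob standing in for a feature-map design, and the scoring rule are all hypothetical. A random perturbation stands in for the LLM, and the score is not a real classification accuracy.

```python
import random

def generate(feedback, rng):
    """Generation stand-in: propose a candidate near the best design so far.
    (In the paper this step is performed by an LLM; here it is random.)"""
    base = feedback["best_design"]["scale"] if feedback["best_design"] else 1.0
    return {"scale": base + rng.uniform(-0.5, 0.5)}

def validate(design):
    """Validation stand-in: reject candidates outside a runnable range."""
    return 0.0 < design["scale"] < 5.0

def evaluate(design):
    """Evaluation stand-in: toy score in [0, 1] that peaks at scale = 2."""
    return 1.0 - min(1.0, abs(design["scale"] - 2.0) / 2.0)

def review(storage):
    """Review stand-in: summarize stored results to guide the next proposal."""
    if not storage:
        return {"best_design": None, "best_score": None}
    best = max(storage, key=lambda record: record["score"])
    return {"best_design": best["design"], "best_score": best["score"]}

def run_loop(n_trials=20, seed=0):
    rng = random.Random(seed)
    storage = []  # Storage stand-in: a list of evaluated designs
    for _ in range(n_trials):
        feedback = review(storage)
        design = generate(feedback, rng)
        if not validate(design):
            continue  # skip candidates that fail validation
        storage.append({"design": design, "score": evaluate(design)})
    return storage

storage = run_loop()
summary = review(storage)
```

The point of the sketch is only the control flow: each trial reads feedback from earlier results, proposes a candidate, filters it, scores it, and records the outcome for later trials.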

Paper Structure

This paper contains 36 sections, 3 equations, 6 figures, 3 tables.

Figures (6)

  • Figure 1: Overview of the agentic system for automatic generation of quantum feature maps. When a user provides task instructions to the system, five internal components work collaboratively to autonomously conduct experiments and improvements. As a result, the system generates an executable program that implements a quantum feature map capable of performing the task with high accuracy.
  • Figure 2: Trajectory of classification accuracy on the MNIST dataset using quantum feature maps generated by our agentic system. The curves shown in dark colors (red and blue) represent trials with high-performance initial ideas. The curves shown in lighter colors (orange and light blue) represent an example with low-performance initial ideas. The vertical axis represents the best validation accuracy up to that trial, defined as $\max(\mathrm{accuracy}(t), \mathrm{accuracy}(t-1), \dots)$, where $\mathrm{accuracy}(t)$ denotes the best validation accuracy in the $t$-th trial. The horizontal axis corresponds to the trial number.
  • Figure 3: Trajectory of classification accuracy over the course of all 45 experiments. The values on the Y-axis represent the accuracy obtained at each trial, rather than the best accuracy so far. Therefore, the highest value along the Y-axis corresponds to the final best accuracy. The color bars on the right side show the trial index, which is encoded in the intensity of the plotted points: darker colors indicate later trials. The solid regression line is $y = 0.5164x + 0.4315$ for the validation data and $y = 0.5200x + 0.4276$ for the test data. The black dotted line indicates the baseline; points above this line represent trials that exceeded the initial accuracy, while points below represent trials that fell short of it.
  • Figure 4: The quantum circuit of the generated feature map.
  • Figure 5: Accuracy trajectories achieved by the agentic system, plotted for all 45 experiments. The left side shows the results on the validation data, while the right side shows the results on the test data.
  • ...and 1 more figure
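The quantity on the vertical axis of Figure 2, the best validation accuracy up to trial $t$, is a running maximum over the per-trial accuracies. A minimal sketch follows; the function name `best_so_far` and the sample values are illustrative and are not results from the paper.

```python
import numpy as np

def best_so_far(accuracies):
    """Running maximum: entry t is max(accuracy(0), ..., accuracy(t))."""
    return np.maximum.accumulate(np.asarray(accuracies, dtype=float))

# Made-up per-trial accuracies, for illustration only
per_trial = [0.5, 0.4, 0.7, 0.6]
running_best = best_so_far(per_trial)
```

Because the running maximum never decreases, curves of this kind are monotone, which is why Figure 3 plots the raw per-trial accuracy instead when individual trials need to be compared.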