GroverGPT: A Large Language Model with 8 Billion Parameters for Quantum Searching

Haoran Wang, Pingzhi Li, Min Chen, Jinglei Cheng, Junyu Liu, Tianlong Chen

TL;DR

The paper addresses the feasibility of classically simulating quantum algorithms by using task-specific LLMs to emulate Grover's search. It introduces GroverGPT, an 8B-parameter LLaMA-based model trained on 15T+ tokens that takes multimodal inputs (quantum circuits, QASM, and natural language) and outputs probability distributions that mimic quantum measurement outcomes without explicit state representations, achieving higher accuracy than GPT-4o on quantum-search tasks. The study shows robust generalization from training on systems with few qubits to larger systems (up to $20$ qubits) and presents evidence that GroverGPT captures genuine quantum features, not just classical patterns, though accuracy degrades with increasing system size. It highlights the importance of prompting strategies (QASM + conversation) and data diversity for robustness, suggests that task-specific LLMs can advance quantum algorithm research, and outlines future work on noisy devices, larger qubit counts, and quantum error-correction modeling.

Abstract

Quantum computing is an exciting non-von Neumann paradigm, offering provable speedups over classical computing for specific problems. However, the practical limits of classical simulatability for quantum circuits remain unclear, especially with current noisy quantum devices. In this work, we explore the potential of leveraging Large Language Models (LLMs) to simulate the output of a quantum Turing machine using Grover's quantum circuits, known to provide quadratic speedups over classical counterparts. To this end, we developed GroverGPT, a specialized model based on LLaMA's 8-billion-parameter architecture, trained on over 15 trillion tokens. Unlike brute-force state-vector simulations, which demand substantial computational resources, GroverGPT employs pattern recognition to approximate quantum search algorithms without explicitly representing quantum states. Analyzing 97K quantum search instances, GroverGPT consistently outperformed OpenAI's GPT-4o (45% accuracy), achieving nearly 100% accuracy on 6- and 10-qubit datasets when trained on 4-qubit or larger datasets. It also demonstrated strong generalization, surpassing 95% accuracy for systems with over 20 qubits when trained on 3- to 6-qubit data. Analysis indicates GroverGPT captures quantum features of Grover's search rather than classical patterns, supported by novel prompting strategies to enhance performance. Although accuracy declines with increasing system size, these findings offer insights into the practical boundaries of classical simulatability. This work suggests task-specific LLMs can surpass general-purpose models like GPT-4o in quantum algorithm learning and serve as powerful tools for advancing quantum research.
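For context, the brute-force state-vector simulation that the abstract contrasts with GroverGPT can be sketched in a few lines. This is a minimal illustration, not the authors' data-generation code: the function name `grover_probabilities`, the single-marked-state setting, and the iteration count $\lfloor \frac{\pi}{4}\sqrt{N} \rfloor$ are assumptions made for the example. The amplitude list has length $N = 2^n$, which is the exponential cost that motivates looking for alternatives.

```python
import math

def grover_probabilities(n_qubits, marked, iterations=None):
    """Measurement probabilities after Grover iterations (hypothetical helper).

    Assumes exactly one marked basis state, given as an integer index.
    """
    size = 2 ** n_qubits  # N = 2^n amplitudes are stored explicitly
    if iterations is None:
        # Common textbook choice for one marked state: floor(pi/4 * sqrt(N)).
        iterations = math.floor(math.pi / 4 * math.sqrt(size))
    # Start in the uniform superposition over all basis states.
    amps = [1 / math.sqrt(size)] * size
    for _ in range(iterations):
        # Oracle: flip the sign of the marked amplitude.
        amps[marked] = -amps[marked]
        # Diffusion: reflect every amplitude about the mean amplitude.
        mean = sum(amps) / size
        amps = [2 * mean - a for a in amps]
    return [a * a for a in amps]

probs = grover_probabilities(3, marked=5)
print(round(probs[5], 4))  # 0.9453, i.e. 121/128 after 2 iterations
```

With 3 qubits the marked state is found with probability 121/128 after two iterations, whereas a single classical random guess succeeds with probability 1/8.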

Paper Structure (18 sections, 18 equations, 5 figures)

Figures (5)

  • Figure 1: Overview. We investigate the classical simulation of quantum search through GroverGPT, a large language model approach. Starting from quantum search's implementations via quantum machines and classical simulation, we evaluate GroverGPT along four dimensions: effectiveness of quantum search simulation, generalization from small to large qubit systems, comparative analysis between quantum and classical approaches, and the role of prompt engineering. Through these investigations, GroverGPT demonstrates promising capabilities in bridging quantum-classical computational boundaries.
  • Figure 2: Overview of GroverGPT's pre-training pipeline. From left to right: (1) Data generation begins with implementing Grover's algorithm on a simulated quantum machine, repeating the circuit $\mathcal{O}(\sqrt{N})$ times to construct comprehensive training data where $N=2^n$ and $n$ is the number of qubits. (2) The measurement outcomes are collected, represented by probability distributions across different computational basis states (shown in color-coded bars for different qubit configurations). (3) The corresponding QASM code is generated to provide standardized circuit descriptions. (4) These components are combined through augmented training, integrating both quantum circuit information and measurement data to pre-train the GroverGPT model, which builds upon the Llama-3.1-8B architecture [vavekanand2024llama].
  • Figure 3: Performance evaluation and generalization capability of GroverGPT. (a) Distribution of the pre-training dataset comprising $97$K quantum search examples across different qubit sizes from $3$ to $20$. (b) Training loss curves for different GroverGPT variants, showing convergence behavior during pre-training on 3-6 qubit datasets. (c) Comparative accuracy analysis of GPT-4o, various open-source large models, GroverGPT, and its variants with QASM and conversation components on $6$-qubit (left) and $10$-qubit (right) test sets across different training qubit ranges. (d) Infidelity $\epsilon$ comparison between models on $6$-qubit (left) and $10$-qubit (right) test sets, demonstrating the error reduction as training qubit count increases. (e) Scalability assessment of GroverGPT trained on $3\sim 6$ qubits, showing accuracy (blue line) and Marked Infidelity (blue bars) when tested on larger systems ranging from $6$ to $20$ qubits, highlighting the LLM's generalization capabilities beyond its training domain. (f), (g), (h) Model performance across a wide range of hyper-parameters (LoRA rank, batch size, learning rate, respectively), highlighting accuracy as a robust indicator.
  • Figure 4: The model's robustness across diverse training datasets on different qubit systems illustrates its sensitivity to hyper-parameters during finetuning.
  • Figure 5: Circuit diagram of Grover's search algorithm for 3 qubits, implemented using Qiskit.
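The captions for Figure 3 refer to accuracy and infidelity $\epsilon$ between predicted and reference distributions. The sketch below shows one plausible way such scores could be computed. The definitions are assumptions for illustration only: infidelity is taken as one minus the classical fidelity $F = \left(\sum_i \sqrt{p_i q_i}\right)^2$, and a prediction counts as correct when its most probable basis state is the marked one. The paper's exact definitions may differ, and the function names and the sample prediction are hypothetical.

```python
import math

def infidelity(p, q):
    """1 - classical fidelity between two distributions (assumed definition)."""
    overlap = sum(math.sqrt(a * b) for a, b in zip(p, q))
    return 1.0 - overlap ** 2

def top_state_matches(predicted, marked):
    """True if the most probable basis state equals the marked index (assumed criterion)."""
    best = max(range(len(predicted)), key=predicted.__getitem__)
    return best == marked

# Reference: ideal 3-qubit Grover output with marked state 5 after two iterations.
reference = [1 / 128] * 8
reference[5] = 121 / 128

# Hypothetical model output, made up for this example.
predicted = [0.1 / 7] * 8
predicted[5] = 0.9

print(top_state_matches(predicted, marked=5))
print(infidelity(predicted, reference))
```

A score of this kind only needs the two probability vectors, so it applies equally to a simulator's output and to a distribution parsed from a language model's text response.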