QUASAR: Quantum Assembly Code Generation Using Tool-Augmented LLMs via Agentic RL

Cong Yu, Valter Uotila, Shilong Deng, Qingyuan Wu, Tuo Shi, Songlin Jiang, Lei You, Bo Zhao

TL;DR

QUASAR tackles the challenge of generating accurate parameterized quantum circuits by combining an agentic RL framework with tool-augmented LLMs and an external quantum verifier. The core innovation is a four-level hierarchical reward that first enforces syntactic validity and then aligns generated circuits with ground-truth distributions, Hamiltonian expectation values, and optimization progress. Empirically, QUASAR reaches a Pass@1 syntactic validity of 99.31% and a Pass@10 of 100% on OpenQASM circuit generation, outperforming GPT-4o, GPT-5, and several SFT/RL baselines on QAOA/VQE-style benchmarks. The findings indicate that distributional alignment is the primary driver of semantic fidelity, with optimization- and value-based rewards providing incremental gains, which demonstrates the potential of tool-augmented RL for scalable quantum algorithm design.
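The gating structure of such a reward can be illustrated with a minimal Python sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the function names, the use of Jensen-Shannon divergence for distributional alignment, the relative-error form of the expectation-value term, the step-count proxy for optimization progress, the weights `w_value` and `w_opt`, and the `-1.0` penalty for code that fails to compile.

```python
import math
from typing import Dict

Distribution = Dict[str, float]  # bitstring -> probability


def js_divergence(p: Distribution, q: Distribution) -> float:
    """Jensen-Shannon divergence in bits: 0 for identical, 1 for disjoint supports."""
    keys = set(p) | set(q)
    mix = {k: 0.5 * (p.get(k, 0.0) + q.get(k, 0.0)) for k in keys}

    def kl(a: Distribution) -> float:
        # mix[k] > 0 whenever a[k] > 0, so the ratio is always defined.
        return sum(v * math.log2(v / mix[k]) for k, v in a.items() if v > 0.0)

    return 0.5 * kl(p) + 0.5 * kl(q)


def hierarchical_reward(
    compiled: bool,
    gen_dist: Distribution,
    ref_dist: Distribution,
    gen_energy: float,
    ref_energy: float,
    opt_steps: int,
    max_steps: int,
    w_value: float = 0.5,   # hypothetical weight
    w_opt: float = 0.25,    # hypothetical weight
) -> float:
    # Level 1: syntactic validity gates every other term (assumed penalty).
    if not compiled:
        return -1.0
    # Level 2: alignment with the ground-truth measurement distribution.
    r_dist = 1.0 - js_divergence(gen_dist, ref_dist)
    # Level 3: closeness of the Hamiltonian expectation value (relative error).
    scale = max(abs(ref_energy), 1e-9)
    r_value = max(0.0, 1.0 - abs(gen_energy - ref_energy) / scale)
    # Level 4: optimization progress, proxied by optimizer steps still needed.
    r_opt = max(0.0, 1.0 - opt_steps / max_steps)
    return r_dist + w_value * r_value + w_opt * r_opt
```

In this sketch a circuit that compiles, matches the reference distribution and energy, and needs no further optimization scores `1.0 + w_value + w_opt`, while a non-compiling output is penalized before any semantic term is evaluated.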

Abstract

Designing and optimizing task-specific quantum circuits is crucial to leveraging the advantage of quantum computing. Recent large language model (LLM)-based quantum circuit generation has emerged as a promising automatic solution. However, fundamental challenges remain unaddressed: (i) parameterized quantum gates require precise numerical values for optimal performance, and this performance also depends on multiple aspects, including the number of quantum gates, their parameters, and the layout/depth of the circuits; (ii) LLMs often generate low-quality or incorrect quantum circuits due to a lack of quantum domain-specific knowledge. We propose QUASAR, an agentic reinforcement learning (RL) framework for quantum circuit generation and optimization based on tool-augmented LLMs. To align the LLM with quantum-specific knowledge and improve the generated quantum circuits, QUASAR designs (i) a quantum circuit verification approach with external quantum simulators and (ii) a sophisticated hierarchical reward mechanism in RL training. Extensive evaluation shows improvements in both the syntactic and semantic performance of the generated quantum circuits. When augmenting a 4B LLM, QUASAR achieves a validity of 99.31% in Pass@1 and 100% in Pass@10, outperforming industrial LLMs including GPT-4o, GPT-5 and DeepSeek-V3 as well as several supervised fine-tuning (SFT)-only and RL-only baselines.
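For readers unfamiliar with the Pass@k metric, the sketch below shows the unbiased estimator commonly used in code-generation evaluation, 1 - C(n-c, k) / C(n, k), where n samples are drawn per problem and c of them are valid. The abstract does not say how Pass@1 and Pass@10 were computed, so applying this estimator here is an assumption, and the function name `pass_at_k` is illustrative.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Probability that at least one of k samples, drawn without
    replacement from n generated samples of which c are valid, is valid."""
    if not 0 <= c <= n:
        raise ValueError("c must satisfy 0 <= c <= n")
    if not 1 <= k <= n:
        raise ValueError("k must satisfy 1 <= k <= n")
    # Fewer than k invalid samples: every size-k draw contains a valid one.
    if n - c < k:
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)
```

Under this estimator, a per-problem score is averaged over all benchmark problems; a model can have a Pass@1 below 100% and still reach 100% Pass@10 if every problem has at least one valid sample among its ten draws.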

Paper Structure

This paper contains 41 sections, 36 equations, 5 figures, 4 tables.

Figures (5)

  • Figure 1: Illustration of four possible outcomes for generated OpenQASM code: (a) fails to compile; (b) compiles but uses an incorrect number of qubits; (c) compiles with the correct number of qubits but suboptimal parameters; (d) the desired case — compiles successfully, uses the correct number of qubits, and achieves near-optimal parameters.
  • Figure 2: Example of an LLM-generated ansatz with initial parameters.
  • Figure 3: QUASAR design: (a) agentic RL-quantum framework, and (b) hierarchical reward.
  • Figure 4: HQCR with varying threshold.
  • Figure 5: $\Delta E$ for valid QASMs.