Hybrid Quantum Transformer for Language Generation

Desheng Kong, Xiangshuo Cui, Jiaying Jin, Jing Xu, Donglin Wang

TL;DR

HyQuT tackles the high resource cost of large language models by integrating variational quantum circuits into Transformer layers, creating a hybrid quantum–classical backbone capable of generative NLP. By implementing a three-stage quantum module with adaptive dimensionality reduction, batched quantum processing, and a quantum-enhanced self-attention mechanism, the approach achieves substantial parameter and compute reductions without sacrificing convergence or text quality. Empirical results at 8M and 150M scales show that as few as 10 qubits with 80 gates can replace roughly 10% of classical parameters in a 150M model, with stable training and coherent generation, presenting a viable near-term quantum augmentation path for NLP. The work thus provides a proof of concept for quantum-augmented generative NLP and suggests directions for scaling quantum integrations in future large-scale models.

Abstract

Although quantum computing has been increasingly applied to replace classical computation, most existing quantum or hybrid models remain confined to simple tasks, with no successful application to large-scale natural language generation to date. In this work, we present the first hybrid quantum-classical large language model (LLM) for natural language generation, HyQuT, capable of performing coherent and context-aware dialogue. The proposed architecture integrates variational quantum circuits (VQCs) into the Transformer framework at both 8M and 150M parameter scales. Experimental results show that a minimal number of qubits (10 qubits with 80 quantum gates) can replace about 10% of the classical parameters in the 150M-parameter model, while achieving comparable convergence stability and generation quality. This study provides an early demonstration of the feasibility of integrating quantum computing into large-scale generative language models.
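
To make the idea of a VQC sitting inside a classical layer more concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes a generic pattern: a classical down-projection to one rotation angle per qubit, a small simulated circuit of RY rotations and a ring of CNOTs, Pauli-Z expectation values as outputs, and a classical up-projection. The function names (`vqc`, `hybrid_projection`), the gate layout, and the sizes (4 qubits, 2 layers, width 16) are all illustrative assumptions and do not reproduce the 10-qubit, 80-gate circuits of HyQuT.

```python
import numpy as np

def ry(theta):
    """2x2 matrix of the single-qubit RY(theta) rotation."""
    c, s = np.cos(theta / 2), np.sin(theta / 2)
    return np.array([[c, -s], [s, c]])

def apply_single_qubit(state, gate, qubit):
    """Apply a 2x2 gate to one qubit of a state stored with shape (2,)*n."""
    state = np.tensordot(gate, state, axes=([1], [qubit]))
    return np.moveaxis(state, 0, qubit)

def apply_cnot(state, control, target):
    """Flip the target qubit on the slice where the control qubit is 1."""
    state = state.copy()
    idx = [slice(None)] * state.ndim
    idx[control] = 1
    sub = state[tuple(idx)].copy()
    # The control axis is dropped in `sub`, so later axes shift down by one.
    axis = target if target < control else target - 1
    state[tuple(idx)] = np.flip(sub, axis=axis)
    return state

def z_expectations(state):
    """Pauli-Z expectation value of every qubit."""
    probs = np.abs(state) ** 2
    n = state.ndim
    out = np.empty(n)
    for q in range(n):
        other_axes = tuple(a for a in range(n) if a != q)
        p = probs.sum(axis=other_axes)  # marginal [p(0), p(1)] for qubit q
        out[q] = p[0] - p[1]
    return out

def vqc(angles, weights):
    """Angle-encode `angles`, then apply trainable RY + CNOT-ring layers.

    angles:  shape (n,), one encoding angle per qubit
    weights: shape (layers, n), trainable rotation angles (assumed layout)
    """
    n = len(angles)
    state = np.zeros((2,) * n)
    state[(0,) * n] = 1.0  # start in |0...0>
    for q in range(n):
        state = apply_single_qubit(state, ry(angles[q]), q)
    for layer in weights:
        for q in range(n):
            state = apply_single_qubit(state, ry(layer[q]), q)
        if n > 1:
            for q in range(n):
                state = apply_cnot(state, q, (q + 1) % n)
    return z_expectations(state)

def hybrid_projection(x, w_down, weights, w_up):
    """Classical down-projection -> simulated VQC -> classical up-projection."""
    angles = np.pi * np.tanh(w_down @ x)  # squash features into rotation angles
    return w_up @ vqc(angles, weights)

# Illustrative sizes only (hypothetical, chosen so the simulation is cheap).
rng = np.random.default_rng(0)
d_model, n_qubits, n_layers = 16, 4, 2
x = rng.normal(size=d_model)
w_down = rng.normal(size=(n_qubits, d_model)) / np.sqrt(d_model)
weights = rng.uniform(-np.pi, np.pi, size=(n_layers, n_qubits))
w_up = rng.normal(size=(d_model, n_qubits)) / np.sqrt(n_qubits)
y = hybrid_projection(x, w_down, weights, w_up)
```

In this sketch the circuit itself carries only `n_layers * n_qubits` trainable angles, which gives a rough sense of why a small circuit could stand in for a larger block of classical weights; how HyQuT actually allocates its gates and parameters is described in the paper, not here.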

Paper Structure

This paper contains 36 sections, 37 equations, 5 figures, 4 tables.

Figures (5)

  • Figure 1: Overall hybrid architecture. The framework consists of a general-purpose projection module (middle) and its specific integration into the Transformer architecture (right).
  • Figure 2: Schematic diagrams of the quantum circuits: (a) the circuit of the HyQuT-8M model and (b) the circuit of the HyQuT-150M model. The two differ mainly in the entanglement strength of the variational quantum parameters and in the type of rotation gate.
  • Figure 3: Loss Curve of HyQuT-8M
  • Figure 4: Loss Curve of HyQuT-150M
  • Figure 5: Qualitative Examples of Text Generated by the HyQuT-150M Model