A Hybrid Classical-Quantum Fine Tuned BERT for Text Classification

Abu Kaisar Mohammad Masum, Naveed Mahmud, M. Hassan Najafi, Sercan Aygun

TL;DR

The paper investigates enhancing BERT for text classification by coupling it with an $n_q$-qubit variational quantum circuit (VQC) in a hybrid classical-quantum framework. BERT's 768-dimensional representation is first pooled, then passed through an angle-encoded quantum layer with a shallow VQC, and finally fed to a classical head trained with mean-squared error. Experiments on IMDb, Spam, SST, Yelp, and Twitter show performance competitive with classical baselines, with notable gains on several datasets, while also exposing training-time and scalability challenges caused by quantum circuit simulation. The work demonstrates feasibility and provides a roadmap for integrating quantum components into large pre-trained language models for NLP tasks.
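The pipeline summarized above (pooling, angle encoding, a shallow VQC, and a classical head with an MSE loss) can be illustrated with a small NumPy state-vector simulation. This is only a sketch under stated assumptions, not the authors' implementation: the chunk-mean pooling, the `tanh` scaling to angles, the RY rotations, the CNOT ring, the sigmoid head, and every function name are hypothetical choices made for illustration, and a random vector stands in for a real BERT embedding.

```python
import numpy as np

def ry(theta):
    """Single-qubit RY rotation matrix (real-valued)."""
    c, s = np.cos(theta / 2), np.sin(theta / 2)
    return np.array([[c, -s], [s, c]])

def apply_1q(state, gate, q):
    """Apply a 2x2 gate to qubit q of a state stored as a (2,)*n tensor."""
    state = np.tensordot(gate, state, axes=([1], [q]))
    return np.moveaxis(state, 0, q)

def apply_cnot(state, control, target):
    """CNOT: swap the target's amplitudes on the control=1 slice."""
    state = state.copy()
    idx = [slice(None)] * state.ndim
    idx[control] = 1
    sub = state[tuple(idx)]
    ax = target if target < control else target - 1  # target axis inside the slice
    state[tuple(idx)] = np.flip(sub, axis=ax).copy()
    return state

def quantum_layer(angles, weights):
    """Angle encoding + one variational layer + Pauli-Z expectation values.

    ASSUMPTION: RY encoding, one trainable RY per qubit, and a CNOT ring.
    The source only states 'angle-encoded' and 'shallow VQC'.
    """
    n = len(angles)
    state = np.zeros((2,) * n)
    state[(0,) * n] = 1.0                              # start in |0...0>
    for q in range(n):
        state = apply_1q(state, ry(angles[q]), q)      # data encoding
    for q in range(n):
        state = apply_1q(state, ry(weights[q]), q)     # trainable rotations
    for q in range(n):
        state = apply_cnot(state, q, (q + 1) % n)      # entangling ring
    probs = np.abs(state) ** 2
    out = []
    for q in range(n):
        other = tuple(a for a in range(n) if a != q)
        p = probs.sum(axis=other)                      # marginal of qubit q
        out.append(p[0] - p[1])                        # <Z> = P(0) - P(1)
    return np.array(out)

def pool(embedding, n_q):
    """ASSUMPTION: reduce the 768-dim vector to n_q values by chunk means."""
    return np.array([chunk.mean() for chunk in np.array_split(embedding, n_q)])

def hybrid_forward(embedding, vqc_weights, head_w, head_b):
    """Pooled features -> angles -> quantum layer -> sigmoid head (assumed)."""
    angles = np.pi * np.tanh(pool(embedding, len(vqc_weights)))
    q_out = quantum_layer(angles, vqc_weights)
    return 1.0 / (1.0 + np.exp(-(q_out @ head_w + head_b)))

rng = np.random.default_rng(0)
n_q = 4
embedding = rng.normal(size=768)          # stand-in for a BERT representation
vqc_weights = rng.uniform(0, 2 * np.pi, size=n_q)
head_w, head_b = rng.normal(size=n_q), 0.0

pred = hybrid_forward(embedding, vqc_weights, head_w, head_b)
loss = (pred - 1.0) ** 2                  # MSE against a positive label
print(f"prediction={pred:.4f}  mse={loss:.4f}")
```

The state tensor has $2^{n_q}$ entries, so simulation cost grows exponentially with the qubit count, which is consistent with the scalability concern raised in the summary.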

Abstract

Fine-tuning BERT for text classification can be computationally challenging and requires careful hyper-parameter tuning. Recent studies have highlighted the potential of quantum algorithms to outperform conventional methods in machine learning and text classification tasks. In this work, we propose a hybrid approach that integrates an n-qubit quantum circuit with a classical BERT model for text classification. We evaluate the performance of the fine-tuned classical-quantum BERT and demonstrate its feasibility as well as its potential in advancing this research area. Our experimental results show that the proposed hybrid model achieves performance that is competitive with, and in some cases better than, the classical baselines on standard benchmark datasets. Furthermore, our approach demonstrates the adaptability of classical-quantum models for fine-tuning pre-trained models across diverse datasets. Overall, the hybrid model highlights the promise of quantum computing in achieving improved performance for text classification tasks.

Paper Structure

This paper contains 18 sections, 4 equations, 3 figures, 3 tables.

Figures (3)

  • Figure 1: Angle encoding, variational circuit, and the measurement steps in an example VQC.
  • Figure 2: Overview of our approach for a hybrid classical-quantum BERT model.
  • Figure 3: Training time of the hybrid classical-quantum model across datasets (IMDb, Spam, SST, Yelp, Twitter) for different qubit counts. Training time grows moderately on IMDb, Spam, and SST, while it more than doubles on Yelp and Twitter as the number of qubits increases. Key values: IMDb (2 qubits: 2760s, 10 qubits: 3900s), Spam (2 qubits: 2880s, 10 qubits: 4140s), SST (2 qubits: 2988s, 10 qubits: 4740s), Yelp (2 qubits: 3126s, 10 qubits: 8612s), Twitter (2 qubits: 3480s, 10 qubits: 8910s). The relationship between qubit count and training time is therefore dataset-specific.
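
To make the Figure 3 numbers easier to compare, the short sketch below computes the ratio of the 10-qubit training time to the 2-qubit training time for each dataset. It uses only the values quoted in the caption; the variable names are illustrative.

```python
# Training times in seconds (2 qubits, 10 qubits), as quoted in the Figure 3 caption.
times = {
    "IMDb": (2760, 3900),
    "Spam": (2880, 4140),
    "SST": (2988, 4740),
    "Yelp": (3126, 8612),
    "Twitter": (3480, 8910),
}

# Slowdown factor when moving from 2 to 10 qubits.
ratios = {name: t10 / t2 for name, (t2, t10) in times.items()}

for name, ratio in sorted(ratios.items(), key=lambda kv: kv[1]):
    print(f"{name}: {ratio:.2f}x")
```

The ratios stay below 2 for IMDb, Spam, and SST and exceed 2.5 for Yelp and Twitter, which is the dataset-specific difference the caption describes.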