SplaXBERT: Leveraging Mixed Precision Training and Context Splitting for Question Answering

Zhu Yufan, Hao Zeyu, Li Siqi, Niu Boqian

TL;DR

SplaXBERT addresses long-document QA by combining context splitting with mixed-precision fine-tuning of ALBERT-xlarge for extractive QA on SQuAD v1.1. It reaches an Exact Match of 85.95% and an F1 score of 92.97% while reducing training time and memory usage, a practical gain for resource-constrained QA. Key contributions are an overlapping-context splitting strategy, a mixed-precision training regimen, and empirical gains over BERT-based baselines. The authors point to scalable, efficient QA on long real-world documents as the main application and outline future work on broader model exploration and optimization.

Abstract

SplaXBERT, built on ALBERT-xlarge with context-splitting and mixed precision training, achieves high efficiency in question-answering tasks on lengthy texts. Tested on SQuAD v1.1, it attains an Exact Match of 85.95% and an F1 Score of 92.97%, outperforming traditional BERT-based models in both accuracy and resource efficiency.
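
The overlapping context splitting mentioned above can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `split_context`, its parameters, and the reading of "stride" as the number of tokens shared by consecutive windows are assumptions made for illustration, and the sketch works on a plain token list rather than on a real tokenizer's output.

```python
from typing import List


def split_context(tokens: List[str], max_len: int, stride: int) -> List[List[str]]:
    """Split `tokens` into windows of at most `max_len` tokens.

    Consecutive windows share `stride` tokens (an assumed meaning of
    "overlap stride"), so an answer span cut by one window boundary
    can still appear whole in the next window.
    """
    if max_len <= 0:
        raise ValueError("max_len must be positive")
    if not 0 <= stride < max_len:
        raise ValueError("stride must satisfy 0 <= stride < max_len")

    step = max_len - stride  # how far the window start advances each time
    windows: List[List[str]] = []
    start = 0
    while True:
        windows.append(tokens[start:start + max_len])
        # Stop once the current window reaches the end of the context.
        if start + max_len >= len(tokens):
            break
        start += step
    return windows


if __name__ == "__main__":
    context = "the quick brown fox jumps over the lazy dog today".split()
    for window in split_context(context, max_len=4, stride=2):
        print(window)
```

In an extractive QA setting, each window would be paired with the question and scored separately, and the highest-scoring answer span across windows kept; how SplaXBERT aggregates across windows is not stated in this summary.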

Paper Structure

This paper contains 31 sections, 1 equation, 6 figures, 7 tables.

Figures (6)

  • Figure 1: Overview of the BERT Base QA pipeline
  • Figure 2: Grid Search Results: Exact Match Score by Context Length and Overlap Stride
  • Figure 3: Grid Search Results: F1 Score by Context Length and Overlap Stride
  • Figure 4: Overview of the SplaXBERT QA pipeline
  • Figure 5: Overview of the BERT Base fine-tuning QA pipeline, showing the flow of data from input processing to answer extraction.
  • ...and 1 more figure