TCE at Qur'an QA 2023 Shared Task: Low Resource Enhanced Transformer-based Ensemble Approach for Qur'anic QA

Mohammed Alaa Elkomy, Amany Sarhan

TL;DR

This work tackles Qur’an QA, a low-resource Arabic setting, by combining transfer learning with external Arabic resources. For Task A, it uses ensemble voting across dual-encoder and cross-encoder architectures. For Task B, it fine-tunes LMs for extractive MRC with FAL and MAL learning, followed by post-processing. It introduces faithful splits to address leakage and leverages external resources (tafseer and TyDI-QA GoldP) to boost learning. On the hidden split, the best systems achieve a MAP of $25.05\%$ for Task A and a $pAP$ of $57.11\%$ for Task B, with the TF-IDF baseline far behind at $9.03\%$ MAP. The main contributions are the integration of external resources, an ensemble framework, thresholding for zero-answer detection, and a careful dataset splitting strategy to improve generalization under data scarcity. The pipeline, code, and models are released for the community as a reproducible basis for low-resource QA over highly structured religious texts.

Abstract

In this paper, we present our approach to tasks A and B of the Qur'an QA 2023 shared task. To address the challenge of scarce training data, we rely on transfer learning together with a voting ensemble to improve prediction stability across multiple runs. Additionally, we employ different architectures and learning mechanisms for a range of Arabic pre-trained transformer-based models for both tasks. To identify unanswerable questions, we propose using a thresholding mechanism. Our top-performing systems greatly surpass the baseline performance on the hidden split, achieving a MAP score of 25.05% for task A and a partial Average Precision (pAP) of 57.11% for task B.
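The abstract names two mechanisms, an ensemble over multiple runs and a threshold for unanswerable questions, without spelling out how they combine. The sketch below is one plausible reading, not the authors' implementation: it assumes each run yields a relevance score per candidate passage, replaces voting with simple score averaging, and declares a question unanswerable when the best averaged score falls below a threshold. The function name `ensemble_rank`, the passage ids, and all scores are hypothetical.

```python
from statistics import mean


def ensemble_rank(runs, threshold, top_k=10):
    """Combine several runs and apply a no-answer threshold.

    runs: list of dicts mapping passage id -> relevance score (one dict per run).
    Returns a ranked list of (passage_id, averaged_score), or [] when the
    question is judged unanswerable. Score averaging is an assumption standing
    in for the voting scheme mentioned in the abstract.
    """
    candidate_ids = set().union(*runs)
    # A run that did not score a candidate contributes 0.0 to its average.
    averaged = {
        pid: mean(run.get(pid, 0.0) for run in runs) for pid in candidate_ids
    }
    # Sort by descending score; break ties by id so the output is deterministic.
    ranked = sorted(averaged.items(), key=lambda kv: (-kv[1], kv[0]))[:top_k]
    # If even the best candidate is below the threshold, return no answer.
    if not ranked or ranked[0][1] < threshold:
        return []
    return ranked


if __name__ == "__main__":
    runs = [{"a": 0.9, "b": 0.2}, {"a": 0.7, "c": 0.4}]
    print(ensemble_rank(runs, threshold=0.5))
    print(ensemble_rank(runs, threshold=0.95))
```

Averaging over runs dampens the variance of any single fine-tuning run, which is the stability benefit the abstract attributes to ensembling. In practice the threshold would be tuned on held-out data.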

Paper Structure (24 sections, 18 equations, 7 figures, 9 tables)

Figures (7)

  • Figure 1: A sample from shared task A. We highlight the most relevant part in each Qur’anic segment.
  • Figure 2: A sample from shared task B. We highlight the ground truth answers in the Qur’anic passage.
  • Figure 3: Distribution of QRCDv1.2 over the 11 topics for task A questions and task B triplets.
  • Figure 4: Diagrams for model architectures for task A.
  • Figure 5: Generic architecture illustration of a LM for ranking MRC.
  • ...and 2 more figures