Advancements and Challenges in Bangla Question Answering Models: A Comprehensive Review

Md Iftekhar Islam Tashik, Abdullah Khondoker, Enam Ahmed Taufik, Antara Firoz Parsa, S M Ishtiak Mahmud

TL;DR

The paper addresses Bangla QA systems under low-resource conditions, where annotated data and comprehensive benchmarks are scarce. It surveys seven studies spanning data collection, preprocessing, model architectures (including Seq2Seq LSTM with attention and various transformer models), and transfer-learning strategies (zero-shot and fine-tuning). Key contributions include datasets such as BanglaRQA and BFQA, and cross-model analyses that clarify the relative gains from pretrained language models (e.g., BanglaBERT, BanglaT5, mT5) and similarity-based approaches. The findings show notable progress but also a persistent gap between Bangla and English QA, and they point to the need for larger, more diverse Bangla corpora and domain-adaptive methods to improve practical language understanding.

Abstract

Natural Language Processing (NLP) research has made notable progress on Bangla Question Answering (QA) systems. This paper presents a comprehensive review of seven research articles that contribute to this progress. The studies examine different aspects of building QA systems for the Bangla language: data collection, preprocessing, model design, experimentation, and interpretation of results. They introduce methods such as LSTM-based models with attention mechanisms, context-based QA systems, and deep learning techniques based on prior knowledge. Despite this progress, several challenges remain, including the lack of well-annotated data, the absence of high-quality reading comprehension datasets, and difficulty in understanding the meaning of words in context. These challenges limit the precision and applicability of Bangla QA models. This review highlights both the developments achieved in creating Bangla QA systems and the ongoing effort required to overcome these obstacles and improve performance on real language comprehension tasks.

Paper Structure

This paper contains 15 sections, 4 figures, 1 table.

Figures (4)

  • Figure 1: Text classification approach
  • Figure 2: Sample of Dataset
  • Figure 3: Data Preprocessing Process
  • Figure 4: Stop Words Example