Advancements and Challenges in Bangla Question Answering Models: A Comprehensive Review
Md Iftekhar Islam Tashik, Abdullah Khondoker, Enam Ahmed Taufik, Antara Firoz Parsa, S M Ishtiak Mahmud
TL;DR
The paper reviews Bangla QA systems under low-resource conditions, where annotated data and comprehensive benchmarks are scarce. It surveys seven studies spanning data collection, preprocessing, model architectures (including Seq2Seq LSTM with attention and various transformer models), and transfer-learning strategies (zero-shot and fine-tuning). Key contributions of the surveyed work include datasets such as BanglaRQA and BFQA, and cross-model analyses that illuminate the relative gains from pretrained language models (e.g., BanglaBERT, BanglaT5, mT5) and similarity-based approaches. The findings highlight notable progress but also persistent gaps between Bangla and English QA, and they emphasize the need for larger, more diverse Bangla corpora and domain-adaptive methods to improve practical language understanding.
Abstract
Bangla Question Answering (QA) systems have seen notable progress within Natural Language Processing (NLP). This paper presents a comprehensive review of seven research articles that contribute to this progress. The studies explore different aspects of building question-answering systems for the Bangla language, covering data collection, preprocessing, model design, experimentation, and interpretation of results. They introduce methods such as LSTM-based models with attention mechanisms, context-based QA systems, and deep learning techniques based on prior knowledge. Despite this progress, several challenges remain, including the lack of well-annotated data, the absence of high-quality reading comprehension datasets, and difficulty in understanding the meaning of words in context. These challenges constrain the precision and applicability of Bangla QA models. This review highlights both the developments achieved in creating Bangla QA systems and the ongoing effort required to overcome these obstacles and improve performance on real-world language comprehension tasks.
