ViMultiChoice: Toward a Method That Gives Explanation for Multiple-Choice Reading Comprehension in Vietnamese

Trung Tien Cao, Lam Minh Thai, Nghia Hieu Nguyen, Duc-Vu Nguyen, Ngan Luu-Thuy Nguyen

TL;DR

To address the lack of explainability in Vietnamese multiple-choice reading comprehension (MCRC), the paper introduces ViRCSoSciD, a large-scale dataset with human explanations, and ViMultiChoice, a method that jointly predicts answers and generates explanations. ViMultiChoice combines ViWordFormer-enhanced Vietnamese encoding, an Option Inference Module, and an Explanation Generator. Evaluated on ViRCSoSciD and ViMMRC 2.0, it achieves state-of-the-art performance and outperforms existing baselines. Multitask training that combines option decision and explanation generation yields consistent gains in accuracy and explanation quality, underscoring the practical value of explainable MCRC for Vietnamese NLP.

Abstract

Multiple-choice Reading Comprehension (MCRC) models aim to select the correct answer from a set of candidate options for a given question. However, they typically lack the ability to explain the reasoning behind their choices. In this paper, we introduce a novel Vietnamese dataset designed to train and evaluate MCRC models with explanation generation capabilities. Furthermore, we propose ViMultiChoice, a new method specifically designed for modeling Vietnamese reading comprehension that jointly predicts the correct answer and generates a corresponding explanation. Experimental results demonstrate that ViMultiChoice outperforms existing MCRC baselines, achieving state-of-the-art (SotA) performance on both the ViMMRC 2.0 benchmark and the newly introduced dataset. Additionally, we show that jointly training option decision and explanation generation leads to significant improvements in multiple-choice accuracy.

Paper Structure

This paper contains 34 sections, 32 equations, 3 figures, 11 tables.

Figures (3)

  • Figure 1: Statistics of questions among options, grades, and subjects in the ViRCSoSciD dataset.
  • Figure 2: Distributions of options among subjects in the ViRCSoSciD dataset.
  • Figure 3: Architecture of the ViMultiChoice method. MSA stands for Multihead Self-Attention module.