Active Learning to Guide Labeling Efforts for Question Difficulty Estimation

Arthur Thuy, Ekaterina Loginova, Dries F. Benoit

TL;DR

This work explores active learning for QDE, a supervised human-in-the-loop approach striving to minimize the labeling efforts while matching the performance of state-of-the-art models, and proposes a novel acquisition function PowerVariance to add the most informative samples to the labeled set.

Abstract

In recent years, there has been a surge in research on Question Difficulty Estimation (QDE) using natural language processing techniques. Transformer-based neural networks achieve state-of-the-art performance, primarily through supervised methods but with an isolated study in unsupervised learning. While supervised methods focus on predictive performance, they require abundant labeled data. On the other hand, unsupervised methods do not require labeled data but rely on a different evaluation metric that is also computationally expensive in practice. This work bridges the research gap by exploring active learning for QDE, a supervised human-in-the-loop approach striving to minimize the labeling efforts while matching the performance of state-of-the-art models. The active learning process iteratively trains on a labeled subset, acquiring labels from human experts only for the most informative unlabeled data points. Furthermore, we propose a novel acquisition function PowerVariance to add the most informative samples to the labeled set, a regression extension to the PowerBALD function popular in classification. We employ DistilBERT for QDE and identify informative samples by applying Monte Carlo dropout to capture epistemic uncertainty in unlabeled samples. The experiments demonstrate that active learning with PowerVariance acquisition achieves a performance close to fully supervised models after labeling only 10% of the training data. The proposed methodology promotes the responsible use of educational resources, makes QDE tools more accessible to course instructors, and is promising for other applications such as personalized support systems and question-answering tools.
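The acquisition step described in the abstract can be illustrated with a minimal sketch. It assumes that PowerVariance follows the power-sampling scheme of PowerBALD, i.e. pool points are drawn without replacement with probability proportional to their predictive variance raised to a power `beta`. The exact formulation is not given in this excerpt, so the function names (`variance_scores`, `power_acquire`), the `beta` parameter, and the simulated Monte Carlo dropout predictions are all illustrative assumptions and not the authors' implementation.

```python
import numpy as np


def variance_scores(mc_preds: np.ndarray) -> np.ndarray:
    """Predictive variance per pool point.

    mc_preds has shape (T, N): T stochastic forward passes (e.g. with
    dropout kept active at inference time) over N unlabeled pool points.
    """
    return mc_preds.var(axis=0)


def power_acquire(scores, k, beta=1.0, rng=None):
    """Draw k distinct pool indices with probability proportional to scores**beta.

    Assumed form of power acquisition. It uses the Gumbel-top-k trick:
    perturbing beta * log(score) with Gumbel noise and keeping the k
    largest values samples without replacement from the power distribution.
    """
    rng = np.random.default_rng() if rng is None else rng
    scores = np.asarray(scores, dtype=float)
    eps = 1e-12  # guards against log(0) for zero-variance points
    perturbed = beta * np.log(scores + eps) + rng.gumbel(size=scores.shape)
    return np.argsort(-perturbed)[:k]


# Simulated stand-in for MC dropout output: 20 passes over 50 pool points,
# where later points are noisier (higher epistemic uncertainty).
rng = np.random.default_rng(0)
noise_scale = np.linspace(0.05, 1.0, 50)
mc_preds = rng.normal(loc=1.0, scale=noise_scale, size=(20, 50))

scores = variance_scores(mc_preds)
selected = power_acquire(scores, k=5, beta=1.0, rng=rng)
print(selected)  # indices of the pool points to send to the annotator
```

Under this assumption, a large `beta` approaches deterministic selection of the highest-variance points, while `beta = 0` reduces to uniform sampling. The stochasticity is what keeps the acquired batch from concentrating on a few near-identical high-variance questions.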

Paper Structure

This paper contains 14 sections, 3 equations, 6 figures, and 1 table.

Figures (6)

  • Figure 1: Active learning workflow with pool-based sampling. Active learning iteratively trains on a subset of labeled data and acquires labels from an expert annotator for samples in the unlabeled pool. Adapted from Settles (2009).
  • Figure 2: Top-$K$ acquisition toy example. Acquisition scores for each unlabeled pool point are ordered and the top-$K$ points are selected.
  • Figure 3: Discrete RMSE as a function of the labeled dataset size. PowerVariance acquisition outperforms Uniform and Variance acquisition by achieving the lowest discrete RMSE scores as AL progresses. After labeling 10% of the data, its performance is close to the fully supervised model.
  • Figure 4: Active gain over Uniform acquisition as a function of the labeled dataset size. Variance acquisition performs on par with passive learning, while PowerVariance offers an active gain of 0.01 discrete RMSE.
  • Figure 5: Distribution of difficulty levels in the labeled set as a function of the labeled dataset size, per acquisition function. Similar to Variance, PowerVariance selects more level 2 observations but does not neglect level 0 samples.
  • ...and 1 more figure
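The top-$K$ rule from the Figure 2 caption (order the acquisition scores, keep the $K$ highest) can be sketched in a few lines. The scores below are made-up toy values and `top_k_acquire` is a hypothetical helper name, used only for illustration.

```python
import numpy as np


def top_k_acquire(scores, k):
    """Return the indices of the k highest acquisition scores, best first."""
    scores = np.asarray(scores, dtype=float)
    order = np.argsort(-scores, kind="stable")  # descending by score
    return order[:k]


# Toy acquisition scores for six unlabeled pool points.
scores = np.array([0.10, 0.80, 0.35, 0.90, 0.05, 0.60])
chosen = top_k_acquire(scores, k=3)
print(chosen)  # [3 1 5]
```

Because the rule is deterministic, it always picks the same highest-scoring points for a given model state, in contrast to a stochastic scheme that samples in proportion to the scores.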