Toward Ethical AI Through Bayesian Uncertainty in Neural Question Answering

Riccardo Di Sipio

TL;DR

The paper tackles the problem of uncalibrated confidence in neural question answering by adopting a Bayesian framework that treats model parameters as uncertain and marginalizes predictions over a posterior. It demonstrates this approach through three experiments of increasing complexity: a Bayesian MLP on Iris, a frozen-encoder QA setup with a Bayesian head on DistilBERT, and LoRA-adapted BERT with a Laplace posterior over the head on CommonsenseQA. Key contributions include showing how posterior predictive distributions enable uncertainty quantification, selective prediction, and interpretable abstention (an "I don't know" response), without pursuing state-of-the-art accuracy. The findings suggest that lightweight Bayesian methods (MCMC on small heads, or Laplace approximations on adapters) provide practical, uncertainty-aware enhancements for ethical and trustworthy QA systems.

Abstract

We explore Bayesian reasoning as a means to quantify uncertainty in neural networks for question answering. Starting with a multilayer perceptron on the Iris dataset, we show how posterior inference conveys confidence in predictions. We then extend this to language models, applying Bayesian inference first to a frozen head and finally to LoRA-adapted transformers, evaluated on the CommonsenseQA benchmark. Rather than aiming for state-of-the-art accuracy, we compare Laplace approximations against maximum a posteriori (MAP) estimates to highlight uncertainty calibration and selective prediction. This allows models to abstain when confidence is low. An "I don't know" response not only improves interpretability but also illustrates how Bayesian methods can contribute to more responsible and ethical deployment of neural question-answering systems.
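
As a rough illustration of the abstention idea described above, the sketch below averages class probabilities over a set of posterior samples and returns "I don't know" when the averaged confidence is too low. It is not the paper's implementation: the function names, the 0.7 threshold, and the toy logits are assumptions made for this example, and the posterior samples are taken as given (they could come from MCMC or a Laplace approximation).

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def posterior_predictive(logit_samples):
    """Monte Carlo estimate of the posterior predictive distribution.

    logit_samples holds one logit vector per posterior sample of the
    model parameters; the class probabilities are averaged over samples.
    """
    probs = [softmax(s) for s in logit_samples]
    n, k = len(probs), len(probs[0])
    return [sum(p[c] for p in probs) / n for c in range(k)]

def answer_or_abstain(logit_samples, labels, threshold=0.7):
    """Return the top label, or "I don't know" below the threshold.

    The threshold value is a hypothetical choice for illustration.
    """
    mean_probs = posterior_predictive(logit_samples)
    best = max(range(len(labels)), key=lambda c: mean_probs[c])
    if mean_probs[best] < threshold:
        return "I don't know", mean_probs[best]
    return labels[best], mean_probs[best]

labels = ["A", "B", "C"]
# Posterior samples agree: the averaged confidence stays high.
agree = [[5.0, 0.0, 0.0], [4.0, 0.0, 0.0]]
# Posterior samples disagree: averaging spreads the probability mass.
disagree = [[3.0, 0.0, 0.0], [0.0, 3.0, 0.0]]

print(answer_or_abstain(agree, labels))
print(answer_or_abstain(disagree, labels))
```

In the second case each individual sample is confident, but the samples disagree with each other, so the averaged probability of the top class falls below the threshold and the model abstains. A single point estimate such as MAP cannot express this kind of disagreement.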

Paper Structure

This paper contains 7 sections, 2 equations, 8 figures.

Figures (8)

  • Figure 1: One-dimensional priors (gray) and marginalized posteriors (black) for selected parameters. Posteriors concentrate and shift relative to priors as the data updates beliefs.
  • Figure 2: Two-dimensional marginalized posteriors with credible-region contours (68/95/99.7%). Geometry reveals how uncertainty couples parameters.
  • Figure 3: Posterior predictive per sample (mean $\pm 1\sigma$). Black dots = means; blue circle = predicted class; red star = true class.
  • Figure 4: System-level evaluation. Left: calibration (confidence vs. empirical accuracy). Right: accuracy when abstaining below a confidence threshold.
  • Figure 5: Posterior predictive distributions for selected entries in the custom three-class question answering dataset. Black points and error bars show the mean and $\pm 1\sigma$ uncertainty across posterior samples, the blue dot marks the predicted class, and the red star denotes the true label.
  • ...and 3 more figures
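
The right-hand panel described in the Figure 4 caption (accuracy when abstaining below a confidence threshold) can be computed along the following lines. This is a minimal sketch with made-up confidences and correctness flags, not the paper's evaluation code; `selective_accuracy` is a hypothetical helper name.

```python
def selective_accuracy(confidences, correct, threshold):
    """Coverage and accuracy when abstaining below a confidence threshold.

    confidences: top-class predictive probability for each example.
    correct: whether the top-class prediction was right for each example.
    Returns (coverage, accuracy); accuracy is None if nothing is answered.
    """
    kept = [ok for conf, ok in zip(confidences, correct) if conf >= threshold]
    coverage = len(kept) / len(confidences)
    accuracy = sum(kept) / len(kept) if kept else None
    return coverage, accuracy

# Toy values for illustration only, not results from the paper.
confidences = [0.9, 0.8, 0.6, 0.4]
correct = [True, True, False, True]

for t in (0.0, 0.7, 0.95):
    print(t, selective_accuracy(confidences, correct, t))
```

Sweeping the threshold traces the trade-off between how many questions the system answers (coverage) and how often the answers it does give are right.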