Toward Ethical AI Through Bayesian Uncertainty in Neural Question Answering
Riccardo Di Sipio
TL;DR
The paper tackles the problem of uncalibrated confidence in neural question answering by adopting a Bayesian framework that treats model parameters as uncertain and marginalizes predictions over the parameter posterior. It demonstrates this approach through three escalating experiments: a Bayesian MLP on Iris, a frozen-encoder QA setup with a Bayesian head on DistilBERT, and LoRA-adapted BERT with a Laplace posterior over the head on CommonsenseQA. Key contributions include showing how posterior predictive distributions enable uncertainty quantification, selective prediction, and interpretable abstention ("I don't know"), without pursuing state-of-the-art accuracy. The findings suggest that lightweight Bayesian methods, whether MCMC on small heads or Laplace approximations on adapters, provide practical, uncertainty-aware enhancements for ethical and trustworthy QA systems.
Abstract
We explore Bayesian reasoning as a means to quantify uncertainty in neural networks for question answering. Starting with a multilayer perceptron on the Iris dataset, we show how posterior inference conveys confidence in predictions. We then extend this to language models, applying Bayesian inference first to a classification head on a frozen encoder and finally to LoRA-adapted transformers, evaluated on the CommonsenseQA benchmark. Rather than aiming for state-of-the-art accuracy, we compare Laplace approximations against maximum a posteriori (MAP) estimates to highlight uncertainty calibration and selective prediction, which lets a model abstain when its confidence is low. An "I don't know" response not only improves interpretability but also illustrates how Bayesian methods can contribute to more responsible and ethical deployment of neural question-answering systems.
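To make the idea of selective prediction concrete, the following is a minimal sketch, not the paper's implementation. It assumes that some posterior approximation (for example MCMC or a Laplace approximation) has already produced a set of logit vectors for one question, one per posterior sample. The function names `posterior_predictive` and `answer_or_abstain`, the 0.6 threshold, and the toy logits are all hypothetical choices made for illustration.

```python
import numpy as np

def posterior_predictive(logit_samples):
    """Average softmax probabilities over posterior samples.

    logit_samples: array of shape (S, C), one row of class logits
    per posterior sample (S samples, C answer choices).
    """
    # Subtract the row-wise max for a numerically stable softmax.
    z = logit_samples - logit_samples.max(axis=-1, keepdims=True)
    probs = np.exp(z)
    probs /= probs.sum(axis=-1, keepdims=True)
    # Monte Carlo estimate of the posterior predictive distribution.
    return probs.mean(axis=0)

def answer_or_abstain(logit_samples, choices, threshold=0.6):
    """Return the top choice, or "I don't know" if confidence is low.

    The threshold is an illustrative assumption, not a value from the paper.
    """
    p = posterior_predictive(logit_samples)
    best = int(np.argmax(p))
    if p[best] < threshold:
        return "I don't know", p
    return choices[best], p

# CommonsenseQA questions have five answer choices.
choices = ["A", "B", "C", "D", "E"]

# Toy case 1: every posterior sample agrees on choice A.
agree = np.array([[10.0, 0.0, 0.0, 0.0, 0.0]] * 4)

# Toy case 2: samples are individually confident but disagree (A vs. B),
# so the averaged predictive puts roughly 0.5 on each and the model abstains.
disagree = np.array([[10.0, 0.0, 0.0, 0.0, 0.0],
                     [0.0, 10.0, 0.0, 0.0, 0.0]])

print(answer_or_abstain(agree, choices)[0])     # A
print(answer_or_abstain(disagree, choices)[0])  # I don't know
```

The second case shows why averaging over the posterior matters: a single point estimate such as MAP would report one of the two confident answers, whereas the marginalized prediction exposes the disagreement and triggers abstention.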
