Bayesian–AI Fusion for Epidemiological Decision Making: Calibrated Risk, Honest Uncertainty, and Hyperparameter Intelligence
Debashis Chatterjee
TL;DR
This work introduces a two-layer Bayesian–AI framework for epidemiological decision making. It pairs a Bayesian predictive layer, which delivers calibrated individual risk with credible intervals, with a Bayesian optimization layer, which treats hyperparameter tuning and model selection as probabilistic inference. Two running examples (Bayesian logistic regression for diabetes risk and Gaussian-process Bayesian optimization for Cox survival models) demonstrate improved calibration, honest uncertainty quantification, and near-oracle concordance in survival tuning. Simulation studies across low- and high-dimensional regimes show that the Bayesian layer preserves discrimination while delivering trustworthy uncertainty, and that Bayesian optimization consistently steers models toward better performance with quantified uncertainty. Real-data analyses on the Pima Indians Diabetes and GBSG2 breast cancer data illustrate practical implications for cost-sensitive screening and interpretable survival modelling, highlighting the framework’s potential to enhance calibrated risk, robustness, and decision-focused evaluation in epidemiological AI.
Abstract
Modern epidemiological analytics increasingly use machine learning models that offer strong prediction but often lack calibrated uncertainty. Bayesian methods provide principled uncertainty quantification, yet they are often viewed as difficult to integrate with contemporary AI workflows. This paper proposes a unified Bayesian–AI framework that combines Bayesian prediction with Bayesian hyperparameter optimization. We use Bayesian logistic regression to obtain calibrated individual-level disease risk and credible intervals on the Pima Indians Diabetes dataset. In parallel, we use Gaussian-process Bayesian optimization to tune penalized Cox survival models on the GBSG2 breast cancer cohort. This yields a two-layer system: a Bayesian predictive layer that represents risk as a posterior distribution, and a Bayesian optimization layer that treats model selection as inference over a black-box objective. Simulation studies in low- and high-dimensional regimes show that the Bayesian layer provides reliable coverage and improved calibration, while Bayesian shrinkage improves AUC, Brier score, and log-loss. Bayesian optimization consistently pushes survival models toward near-oracle concordance. Overall, Bayesian reasoning enhances both what we infer and how we search, enabling calibrated risk and principled hyperparameter intelligence for epidemiological decision making.
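To make the idea of "risk as a posterior distribution" concrete, the following is a minimal sketch of the predictive layer, not the paper's implementation. It assumes a single standardized covariate, independent Gaussian priors on the coefficients, and a Laplace (Gaussian) approximation to the posterior in place of whatever sampler the paper uses. The data are synthetic, and the function names (`laplace_fit`, `predict_risk`) and prior scale are hypothetical choices made for illustration only.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def grad_hess(b0, b1, xs, ys, tau):
    """Gradient and Hessian of the negative log posterior for
    logit P(y=1 | x) = b0 + b1 * x with independent N(0, 1/tau) priors."""
    g0, g1 = tau * b0, tau * b1
    h00, h01, h11 = tau, 0.0, tau
    for x, y in zip(xs, ys):
        p = sigmoid(b0 + b1 * x)
        w = p * (1.0 - p)
        g0 += p - y
        g1 += (p - y) * x
        h00 += w
        h01 += w * x
        h11 += w * x * x
    return g0, g1, h00, h01, h11


def laplace_fit(xs, ys, prior_sd=2.0, iters=50):
    """Return the MAP estimate and the Laplace posterior covariance.
    prior_sd is an assumed prior scale, not a value from the paper."""
    tau = 1.0 / prior_sd ** 2
    b0 = b1 = 0.0
    for _ in range(iters):
        g0, g1, h00, h01, h11 = grad_hess(b0, b1, xs, ys, tau)
        det = h00 * h11 - h01 * h01
        # Newton step: solve the 2x2 system H d = g in closed form.
        d0 = (h11 * g0 - h01 * g1) / det
        d1 = (h00 * g1 - h01 * g0) / det
        b0, b1 = b0 - d0, b1 - d1
        if abs(d0) + abs(d1) < 1e-10:
            break
    # Posterior covariance is the inverse Hessian at the MAP estimate.
    _, _, h00, h01, h11 = grad_hess(b0, b1, xs, ys, tau)
    det = h00 * h11 - h01 * h01
    cov = (h11 / det, -h01 / det, h00 / det)  # (c00, c01, c11)
    return (b0, b1), cov


def predict_risk(x_new, mean, cov, n_draws=4000, seed=1):
    """Posterior mean risk and a 95% credible interval at x_new,
    obtained by sampling coefficients from the Gaussian approximation."""
    rng = random.Random(seed)
    m0, m1 = mean
    c00, c01, c11 = cov
    # Cholesky factor of the 2x2 covariance matrix.
    l00 = math.sqrt(c00)
    l10 = c01 / l00
    l11 = math.sqrt(c11 - l10 * l10)
    draws = []
    for _ in range(n_draws):
        z0, z1 = rng.gauss(0, 1), rng.gauss(0, 1)
        b0 = m0 + l00 * z0
        b1 = m1 + l10 * z0 + l11 * z1
        draws.append(sigmoid(b0 + b1 * x_new))
    draws.sort()
    lower = draws[int(0.025 * n_draws)]
    upper = draws[int(0.975 * n_draws)]
    return sum(draws) / n_draws, lower, upper


# Synthetic data only: true intercept -0.5 and slope 1.2 are illustrative.
data_rng = random.Random(0)
xs = [data_rng.gauss(0, 1) for _ in range(300)]
ys = [1 if data_rng.random() < sigmoid(-0.5 + 1.2 * x) else 0 for x in xs]

post_mean, post_cov = laplace_fit(xs, ys)
risk_mean, risk_lo, risk_hi = predict_risk(1.0, post_mean, post_cov)
print(f"risk at x=1.0: {risk_mean:.3f} (95% CrI {risk_lo:.3f} to {risk_hi:.3f})")
```

The point of the sketch is that each individual receives a distribution over risk rather than a single number, so the width of the interval can feed directly into a cost-sensitive screening rule. A full analysis would use all covariates and would check the Gaussian approximation against a proper posterior sampler.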
