Bayesian Semi-structured Subspace Inference
Daniel Dold, David Rügamer, Beate Sick, Oliver Dürr
TL;DR
This work targets uncertainty quantification in semi-structured regression (SSR) models, which combine interpretable structured effects with a flexible, unstructured deep neural network (DNN) component. It introduces Bayesian semi-structured subspace inference, which samples the structured parameters in their full space while constraining the DNN weights to a low-dimensional affine subspace defined by a Bézier curve. This makes it possible to explore multiple loss valleys efficiently and to quantify epistemic uncertainty. Across toy, simulated, UCI, and melanoma datasets, the method yields posterior distributions for the structured parameters close to those from full-space MCMC and competitive predictive performance, with calibration and uncertainty estimates improving as the subspace dimension grows. The approach addresses optimization asymmetry in SSR and offers a practical, scalable framework for principled uncertainty quantification in hybrid models, with potential applications such as medical decision support.
Abstract
Semi-structured regression models enable the joint modeling of interpretable structured and complex unstructured feature effects. The structured model part is inspired by statistical models and can be used to infer the input-output relationship for features of particular importance. The complex unstructured part defines an arbitrary deep neural network and thereby provides enough flexibility to achieve competitive prediction performance. While these models can also account for aleatoric uncertainty, there is still a lack of work on accounting for epistemic uncertainty. In this paper, we address this problem by presenting a Bayesian approximation for semi-structured regression models using subspace inference. To this end, we extend subspace inference to sample jointly from the full parameter space of the structured effects and a subspace of the parameter space of the unstructured effects. Apart from this hybrid sampling scheme, our method allows for tunable complexity of the subspace and can capture multiple minima in the loss landscape. Numerical experiments validate our approach's efficacy in recovering structured effect parameter posteriors in semi-structured models and in approaching the full-space MCMC posterior distribution as the subspace dimension increases. Further, our approach exhibits competitive predictive performance across simulated and real-world datasets.
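The hybrid sampling scheme described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation, and everything in it is an assumption made for illustration: the toy one-hidden-layer network `dnn`, the Gaussian likelihood and standard normal priors in `log_joint`, the SVD-based construction of the subspace from control points in `affine_subspace`, and the random-walk Metropolis sampler, which stands in for whatever MCMC method is actually used. The control points here are random placeholders; in practice they would come from a trained Bézier curve.

```python
import numpy as np

def affine_subspace(control_points):
    """Origin and orthonormal basis of the affine subspace through the control points.

    control_points has shape (K + 1, D); the returned basis has shape (D, K).
    """
    origin = control_points.mean(axis=0)
    _, _, vt = np.linalg.svd(control_points - origin, full_matrices=False)
    return origin, vt[: len(control_points) - 1].T

def dnn(z, w, hidden=4):
    """Toy one-hidden-layer network evaluated with a flat weight vector w."""
    d_in = z.shape[1]
    n1 = d_in * hidden
    w1 = w[:n1].reshape(d_in, hidden)
    b1 = w[n1 : n1 + hidden]
    w2 = w[n1 + hidden : n1 + 2 * hidden]
    b2 = w[n1 + 2 * hidden]
    return np.tanh(z @ w1 + b1) @ w2 + b2

def log_joint(beta, phi, x, z, y, origin, basis, sigma=1.0):
    """Unnormalized log posterior; Gaussian likelihood and N(0, 1) priors are assumed."""
    w = origin + basis @ phi       # DNN weights restricted to the subspace
    mean = x @ beta + dnn(z, w)    # structured plus unstructured predictor
    log_lik = -0.5 * np.sum((y - mean) ** 2) / sigma**2
    log_prior = -0.5 * (beta @ beta + phi @ phi)
    return log_lik + log_prior

def sample(x, z, y, control_points, n_steps=500, step=0.05, seed=0):
    """Random-walk Metropolis over the joint vector (beta, phi)."""
    rng = np.random.default_rng(seed)
    origin, basis = affine_subspace(control_points)
    p, k = x.shape[1], basis.shape[1]
    state = np.zeros(p + k)
    lp = log_joint(state[:p], state[p:], x, z, y, origin, basis)
    draws = np.empty((n_steps, p + k))
    for i in range(n_steps):
        proposal = state + step * rng.standard_normal(p + k)
        lp_new = log_joint(proposal[:p], proposal[p:], x, z, y, origin, basis)
        if np.log(rng.uniform()) < lp_new - lp:   # Metropolis accept/reject
            state, lp = proposal, lp_new
        draws[i] = state
    return draws[:, :p], draws[:, p:]

# Synthetic data: two structured features, three unstructured features.
rng = np.random.default_rng(1)
x = rng.standard_normal((50, 2))
z = rng.standard_normal((50, 3))
y = x @ np.array([1.0, -0.5]) + np.sin(z[:, 0]) + 0.1 * rng.standard_normal(50)

# Placeholder control points: 3 points in the 21-dimensional weight space of dnn.
control_points = 0.5 * rng.standard_normal((3, 21))
beta_draws, phi_draws = sample(x, z, y, control_points)
```

In this sketch the structured coefficients `beta` are sampled directly, while the 21 network weights are only ever reached through the two subspace coordinates `phi`, so the sampler works in four dimensions instead of 23. Adding control points increases the subspace dimension, which is the tunable complexity the abstract refers to.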
