Clinical Reasoning over Tabular Data and Text with Bayesian Networks
Paloma Rabaey, Johannes Deleu, Stefan Heytens, Thomas Demeester
TL;DR
This paper enables clinical reasoning over both structured tabular data and unstructured clinical text by extending Bayesian networks with neural text representations. It introduces two architectures: BN-gen-text, a generative approach that models the text as P(T|D0,D1,S0,S1,S2) with Gaussian mixtures over BioLORD embeddings, and BN-discr-text, a discriminative approach that uses a neural text classifier for each parent configuration, allowing joint inference of D0 and D1 given B, S, and T. In a synthetic primary-care pneumonia use case, both approaches improve posterior diagnostic probabilities over a text-free BN, with BN-discr-text approaching the BN++ upper bound when text and symptoms are observed; an ablation confirms the critical role of the text-linked edges. The results demonstrate the value of preserving raw text for clinical decision support and illustrate how neuro-symbolic BN integration can enhance interpretability and robustness to missing data in medical reasoning.
Abstract
Bayesian networks are well suited for clinical reasoning on tabular data, but are less compatible with natural language data, for which neural networks provide a successful framework. This paper compares and discusses strategies to augment Bayesian networks with neural text representations, in both a generative and a discriminative manner. This is illustrated with simulation results for a primary care use case (diagnosis of pneumonia) and discussed in a broader clinical context.
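
To make the generative strategy concrete, the following is a minimal sketch of how a text embedding can enter a Bayesian network as a continuous child node whose class-conditional density is a Gaussian mixture. It is not the paper's model: it assumes a single binary diagnosis D with one binary symptom S and a 2-dimensional stand-in for a text embedding T, and every number in it (PRIOR, P_SYMPTOM, TEXT_MODEL) is a hypothetical placeholder rather than a fitted parameter.

```python
import math


def diag_gaussian_pdf(x, mean, var):
    """Density of a Gaussian with diagonal covariance, evaluated at x."""
    log_p = 0.0
    for xi, mi, vi in zip(x, mean, var):
        log_p += -0.5 * (math.log(2 * math.pi * vi) + (xi - mi) ** 2 / vi)
    return math.exp(log_p)


def mixture_pdf(x, components):
    """Density of a Gaussian mixture; components = [(weight, mean, var), ...]."""
    return sum(w * diag_gaussian_pdf(x, m, v) for w, m, v in components)


# Hypothetical toy parameters (assumptions for illustration only).
PRIOR = {0: 0.9, 1: 0.1}      # P(D)
P_SYMPTOM = {0: 0.2, 1: 0.8}  # P(S=1 | D)
TEXT_MODEL = {                # p(T | D) as a Gaussian mixture per value of D
    0: [(0.5, [-1.0, 0.0], [1.0, 1.0]), (0.5, [0.0, -1.0], [1.0, 1.0])],
    1: [(1.0, [2.0, 2.0], [1.0, 1.0])],
}


def posterior(symptom=None, text_embedding=None):
    """Return P(D | evidence) for whichever of S and T is observed.

    S and T are both children of D in this toy network, so an unobserved
    child marginalises to a factor of 1 and can simply be skipped.
    """
    scores = {}
    for d, p in PRIOR.items():
        score = p
        if symptom is not None:
            score *= P_SYMPTOM[d] if symptom == 1 else 1 - P_SYMPTOM[d]
        if text_embedding is not None:
            score *= mixture_pdf(text_embedding, TEXT_MODEL[d])
        scores[d] = score
    total = sum(scores.values())
    return {d: s / total for d, s in scores.items()}


if __name__ == "__main__":
    print(posterior(symptom=1))
    print(posterior(symptom=1, text_embedding=[2.0, 2.0]))
```

Because missing evidence is handled by dropping the corresponding factor, the same function covers the text-free case, the text-only case, and the combined case. A discriminative variant would instead replace TEXT_MODEL with a classifier that outputs a distribution over D given the text, which this sketch does not attempt.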
