Getting Wiser from Multiple Data: Probabilistic Updating according to Jeffrey and Pearl
Bart Jacobs
TL;DR
The paper tackles how to update beliefs when confronted with multiple pieces of evidence, comparing Jeffrey and Pearl updating rules within a discrete probabilistic framework and illustrating with a medical-test example and a water-flow analogy. It formalises multisets of evidence, defines two validity notions (Jeffrey and Pearl), and shows how the two updates yield different posteriors and predictive consequences, while connecting to channels and predictive coding. A key contribution is the explicit construction of Jeffrey-style convex mixtures of independent updates versus Pearl-style single conjunction updates, including their impact on posterior probabilities and KL-divergence, and a critical discussion of variational free energy as used in predictive coding. The findings clarify when multi-evidence learning preserves or enhances global reliability (e.g., KL-divergence reduction under Jeffrey) and caution against mixing update rules, with implications for AI decision making and cognitive theories.
Abstract
In probabilistic updating one transforms a prior distribution in the light of given evidence into a posterior distribution, via what is called conditioning, updating, belief revision or inference. This is the essence of learning, as Bayesian updating. It will be illustrated via a physical model involving (adapted) water flows through pipes with different diameters. Bayesian updating makes us wiser, in the sense that the posterior distribution makes the evidence more likely than the prior, since it incorporates the evidence. Things are less clear when one wishes to learn from multiple pieces of evidence / data. It turns out that there are (at least) two forms of updating for this, associated with Jeffrey and Pearl. The difference is not always clearly recognised. This paper provides an introduction and an overview in the setting of discrete probability theory. It starts from an elementary question, involving multiple pieces of evidence, that has been sent to a small group of academic specialists. Their answers show considerable differences. This is used as motivation and starting point to introduce the two forms of updating, of Jeffrey and Pearl, for multiple inputs and to elaborate their properties. In the end the account is related to so-called variational free energy (VFE) update in the cognitive theory of predictive processing. It is shown that both Jeffrey and Pearl outperform VFE updating and that VFE updating need not decrease divergence (that is, correct errors) as it is supposed to do.
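
To make the contrast between the two forms of updating concrete, the following is a minimal Python sketch, not the paper's own code. It assumes a two-state prior (disease, healthy), a test with a hypothetical sensitivity of 0.9 and false-positive rate of 0.05, and a multiset of three test results (two positive, one negative). All numbers and function names are illustrative assumptions. Pearl-style updating is read here as a single Bayesian update with the conjunction of all the evidence, and Jeffrey-style updating as a convex mixture of the separate single-evidence updates, weighted by how often each outcome occurs.

```python
import math

def normalize(weights):
    """Rescale non-negative weights so that they sum to one."""
    total = sum(weights)
    return [w / total for w in weights]

def bayes_update(prior, likelihood):
    """Ordinary Bayesian update of a prior with one likelihood vector."""
    return normalize([p * l for p, l in zip(prior, likelihood)])

def pearl_update(prior, channel, counts):
    """Pearl-style: one update with the product of all likelihoods."""
    weights = list(prior)
    for outcome, n in counts.items():
        weights = [w * channel[outcome][x] ** n for x, w in enumerate(weights)]
    return normalize(weights)

def jeffrey_update(prior, channel, counts):
    """Jeffrey-style: frequency-weighted mixture of single updates."""
    total = sum(counts.values())
    posterior = [0.0] * len(prior)
    for outcome, n in counts.items():
        single = bayes_update(prior, channel[outcome])
        posterior = [acc + (n / total) * s for acc, s in zip(posterior, single)]
    return posterior

def predict(state, channel):
    """Push a state through the channel to get predicted outcome probabilities."""
    return {y: sum(s * l for s, l in zip(state, lik)) for y, lik in channel.items()}

def kl(data, prediction):
    """KL divergence from the data distribution to the prediction."""
    return sum(p * math.log(p / prediction[y]) for y, p in data.items() if p > 0)

# Hypothetical numbers, chosen only for illustration.
prior = [0.05, 0.95]                                 # [disease, healthy]
channel = {"pos": [0.9, 0.05], "neg": [0.1, 0.95]}   # P(outcome | state)
counts = {"pos": 2, "neg": 1}                        # multiset of evidence

pearl = pearl_update(prior, channel, counts)
jeffrey = jeffrey_update(prior, channel, counts)

# Empirical distribution of the observed outcomes.
data = {y: n / sum(counts.values()) for y, n in counts.items()}
kl_prior = kl(data, predict(prior, channel))
kl_jeffrey = kl(data, predict(jeffrey, channel))

print("Pearl   P(disease):", round(pearl[0], 4))
print("Jeffrey P(disease):", round(jeffrey[0], 4))
print("KL before / after Jeffrey:", round(kl_prior, 4), round(kl_jeffrey, 4))
```

Under these assumed numbers the two rules give clearly different posterior disease probabilities (roughly 0.64 for Pearl and 0.33 for Jeffrey), and the Jeffrey posterior predicts the observed outcome frequencies with a smaller KL divergence than the prior does.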
