Getting Wiser from Multiple Data: Probabilistic Updating according to Jeffrey and Pearl

Bart Jacobs

TL;DR

The paper tackles how to update beliefs when confronted with multiple pieces of evidence, comparing Jeffrey and Pearl updating rules within a discrete probabilistic framework and illustrating with a medical-test example and a water-flow analogy. It formalises multisets of evidence, defines two validity notions (Jeffrey and Pearl), and shows how the two updates yield different posteriors and predictive consequences, while connecting to channels and predictive coding. A key contribution is the explicit construction of Jeffrey-style convex mixtures of independent updates versus Pearl-style single conjunction updates, including their impact on posterior probabilities and KL-divergence, and a critical discussion of variational free energy as used in predictive coding. The findings clarify when multi-evidence learning preserves or enhances global reliability (e.g., KL-divergence reduction under Jeffrey) and caution against mixing update rules, with implications for AI decision making and cognitive theories.

Abstract

In probabilistic updating one transforms a prior distribution in the light of given evidence into a posterior distribution, via what is called conditioning, updating, belief revision or inference. This is the essence of learning, as Bayesian updating. It will be illustrated via a physical model involving (adapted) water flows through pipes with different diameters. Bayesian updating makes us wiser, in the sense that the posterior distribution makes the evidence more likely than the prior, since it incorporates the evidence. Things are less clear when one wishes to learn from multiple pieces of evidence / data. It turns out that there are (at least) two forms of updating for this, associated with Jeffrey and Pearl. The difference is not always clearly recognised. This paper provides an introduction and an overview in the setting of discrete probability theory. It starts from an elementary question, involving multiple pieces of evidence, that has been sent to a small group of academic specialists. Their answers show considerable differences. This is used as motivation and starting point to introduce the two forms of updating, of Jeffrey and Pearl, for multiple inputs and to elaborate their properties. In the end the account is related to so-called variational free energy (VFE) update in the cognitive theory of predictive processing. It is shown that both Jeffrey and Pearl outperform VFE updating and that VFE updating need not decrease divergence (that is, correct errors), as it is supposed to do.
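
The claim that Bayesian updating "makes us wiser" can be checked numerically on a small case. The following is a minimal sketch, not the paper's own code: the two-element disease space, the prior and the positive-test likelihoods are hypothetical numbers chosen for illustration, and the helper names `validity` and `update` are assumptions. Validity is taken to be the sum $\sum_{x} \omega(x)\cdot p(x)$ of a distribution $\omega$ against an observation $p$.

```python
from fractions import Fraction as F

def validity(omega, p):
    """Validity of observation p in distribution omega: sum_x omega(x) * p(x)."""
    return sum(omega[x] * p[x] for x in omega)

def update(omega, p):
    """Bayesian update: reweigh omega pointwise by p, then normalise by the validity."""
    v = validity(omega, p)
    if v == 0:
        raise ValueError("cannot update with evidence of validity zero")
    return {x: omega[x] * p[x] / v for x in omega}

# Hypothetical numbers, not taken from the paper.
prior = {"d": F(1, 10), "no_d": F(9, 10)}       # disease / no disease
pos_test = {"d": F(9, 10), "no_d": F(1, 20)}    # likelihood of a positive test per state

posterior = update(prior, pos_test)
before = validity(prior, pos_test)      # how likely the evidence is under the prior
after = validity(posterior, pos_test)   # how likely the evidence is under the posterior

print(float(before), float(after))
```

With these assumed numbers the validity of the evidence is higher under the posterior than under the prior, which is the single-evidence sense of "wiser" described in the abstract.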

Paper Structure (15 sections, 15 theorems, 79 equations, 3 figures)

Key Result

lemma 1

Recall from ValidityKetEqn that we write $\omega\models p$ for the sum $\sum_{x} \omega(x)\cdot p(x)$, for a distribution $\omega\in\mathcal{D}(X)$ and an observation $p\in\mathsl{Obs}(X)$. Then:

Figures (3)

  • Figure 1: Posterior, updated disease distributions as water flows regulated by taps, in the style of Subsection \ref{PhysicsSubsec}. The prior is on the top left, and the Pearl update is on the top right. It involves successive conditionings, via successive taps, corresponding to the conjunction of predicates $\mathsl{pt} \mathrel{\&} \mathsl{pt} \mathrel{\&} \mathsl{nt}$ used for updating \ref{PearlMedUpdate}. The Jeffrey update, as a convex combination of conditionings, is at the bottom. The incoming flow of one liter per second is divided, in a convex sum of $\frac{2}{3}$ and $\frac{1}{3}$ liter over two pumps. The positive test tap setting is applied to the left pump and the negative test tap setting is used on the right. The resulting outgoing flows are combined in merged pipes, corresponding to the convex sum \ref{JeffreyMedUpdate}. The diameters of the various pipes are not precise and only give an indication.
  • Figure 2: Jeffrey and Pearl validities in the medical test scenario from the introduction, with 100 different versions of evidence $i|{}\mathsl{pt}{}\rangle + j|{}\mathsl{nt}{}\rangle$, where $i,j\in\{1,\ldots,10\}$.
  • Figure 3: Posterior disease probabilities in the medical test scenario from the introduction, for the two update mechanisms of Jeffrey and Pearl, with 100 different versions of evidence $i|{}\mathsl{pt}{}\rangle + j|{}\mathsl{nt}{}\rangle$, where $i,j\in\{1,\ldots,10\}$.
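
The two mechanisms described in the Figure 1 caption can be sketched in a few lines. This is an illustrative sketch under stated assumptions, not the paper's implementation: the prior and the test likelihoods are hypothetical numbers, and the function names `pearl_update` and `jeffrey_update` are invented here. Only the evidence multiset, two positive tests and one negative test, follows the caption. Pearl updates once with the conjunction $\mathsl{pt} \mathrel{\&} \mathsl{pt} \mathrel{\&} \mathsl{nt}$ (a pointwise product of likelihoods); Jeffrey takes the convex sum, with weights $\frac{2}{3}$ and $\frac{1}{3}$, of the separate updates.

```python
from fractions import Fraction as F

def validity(omega, p):
    """Validity of observation p in distribution omega: sum_x omega(x) * p(x)."""
    return sum(omega[x] * p[x] for x in omega)

def update(omega, p):
    """Bayesian update of omega with observation p."""
    v = validity(omega, p)
    return {x: omega[x] * p[x] / v for x in omega}

def pearl_update(prior, likelihoods, evidence):
    """One update with the conjunction (pointwise product) of all evidence items."""
    conj = {x: F(1) for x in prior}
    for outcome, count in evidence.items():
        for x in prior:
            conj[x] *= likelihoods[outcome][x] ** count
    return update(prior, conj)

def jeffrey_update(prior, likelihoods, evidence):
    """Convex mixture of single-item updates, weighted by evidence frequencies."""
    total = sum(evidence.values())
    posterior = {x: F(0) for x in prior}
    for outcome, count in evidence.items():
        conditioned = update(prior, likelihoods[outcome])
        for x in prior:
            posterior[x] += F(count, total) * conditioned[x]
    return posterior

# Hypothetical prior and test likelihoods, not taken from the paper.
prior = {"d": F(1, 10), "no_d": F(9, 10)}
likelihoods = {
    "pt": {"d": F(9, 10), "no_d": F(1, 20)},   # chance of a positive test per state
    "nt": {"d": F(1, 10), "no_d": F(19, 20)},  # chance of a negative test per state
}
evidence = {"pt": 2, "nt": 1}                  # the multiset 2|pt> + 1|nt>

pearl = pearl_update(prior, likelihoods, evidence)
jeffrey = jeffrey_update(prior, likelihoods, evidence)
print(float(pearl["d"]), float(jeffrey["d"]))
```

Under these assumed numbers the two posterior disease probabilities differ noticeably, which is the kind of gap between the Jeffrey and Pearl updates that Figures 2 and 3 chart across different evidence multisets.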

Theorems & Definitions (42)

  • remark 1
  • definition 1
  • lemma 1
  • definition 2
  • definition 3
  • definition 4
  • remark 2
  • lemma 2
  • proof
  • lemma 3
  • ...and 32 more