Jeffrey's update rule as a minimizer of Kullback-Leibler divergence

Carlos Pinzón; Catuscia Palamidessi

TL;DR

This work gives a concise, high-level proof that Jeffrey's update rule reduces or preserves the KL divergence $D_{\text{KL}}\left(\tau\,||\,\overrightarrow{C}(\theta)\right)$ at each Bayesian update of the parameter $\theta$. The proof decomposes the log-likelihood as $L(\theta)=Q(\theta|\theta_t)+H(\theta|\theta_t)$ and applies an EM-style argument: the Jeffrey update $\theta_{t+1}=\overleftarrow{C_{\theta_t}}(\tau)$ maximizes $Q$, and the Gibbs term $\Delta H$ is nonnegative, so $\Delta L$ is nonnegative, which is equivalent to the KL divergence not increasing after the update. The paper extends the argument to full-image constraints and sparsity, showing that the Jeffrey posterior remains well-defined under mild positivity conditions. The result is a shorter, more accessible proof that clarifies the place of Jeffrey's rule within Bayesian learning and EM frameworks.
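The decomposition mentioned above can be spelled out as follows. This is a sketch of the standard EM bookkeeping, not a transcription of the paper: the explicit formulas for $L$, $Q$ and $H$, the channel entries $C(x,y)$, and the posterior notation $p_\theta(x|y)$ are assumptions introduced here for illustration, and the paper's own definitions may differ in detail.

$$
\begin{aligned}
p_\theta(x|y) &= \frac{\theta(x)\,C(x,y)}{\overrightarrow{C}(\theta)(y)}, \qquad \overrightarrow{C}(\theta)(y)=\sum_{x}\theta(x)\,C(x,y),\\
L(\theta) &= \sum_{y}\tau(y)\log \overrightarrow{C}(\theta)(y),\\
Q(\theta|\theta_t) &= \sum_{y}\tau(y)\sum_{x}p_{\theta_t}(x|y)\log\big(\theta(x)\,C(x,y)\big),\\
H(\theta|\theta_t) &= -\sum_{y}\tau(y)\sum_{x}p_{\theta_t}(x|y)\log p_{\theta}(x|y),\\
D_{\text{KL}}\left(\tau\,||\,\overrightarrow{C}(\theta)\right) &= \sum_{y}\tau(y)\log\tau(y)-L(\theta),\\
H(\theta_{t+1}|\theta_t)-H(\theta_t|\theta_t) &= \sum_{y}\tau(y)\,D_{\text{KL}}\left(p_{\theta_t}(\cdot|y)\,||\,p_{\theta_{t+1}}(\cdot|y)\right)\;\ge\;0.
\end{aligned}
$$

Under these definitions, $Q+H=L$ because $\log\big(\theta(x)C(x,y)\big)-\log p_\theta(x|y)=\log\overrightarrow{C}(\theta)(y)$ does not depend on $x$. Since the first term of the KL divergence does not depend on $\theta$, an increase in $L$ is the same as a decrease in the divergence.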

Abstract

In this paper, we give a more concise and higher-level proof than the original one, due to Bart Jacobs, of the following theorem: in the context of Bayesian update rules for learning or updating internal states that produce predictions, applying Jeffrey's update rule to the internal state reduces the relative entropy between the observations and the predictions.
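The theorem can be checked numerically on a small example. The sketch below is illustrative only and assumes a finite setting: the internal state `theta` is a probability vector over hidden values, the channel is a row-stochastic matrix `channel[x][y]`, and `tau` is the observed distribution. All names and the specific numbers are hypothetical choices, not taken from the paper.

```python
import math


def predict(theta, channel):
    """Forward prediction: sum over x of theta[x] * channel[x][y], for each y."""
    n_obs = len(channel[0])
    return [
        sum(theta[x] * channel[x][y] for x in range(len(theta)))
        for y in range(n_obs)
    ]


def jeffrey_update(theta, channel, tau):
    """Jeffrey's rule: mix the Bayesian posteriors p(x|y) with weights tau[y].

    Assumes the prediction is positive wherever tau is positive.
    """
    pred = predict(theta, channel)
    new_theta = []
    for x in range(len(theta)):
        total = 0.0
        for y in range(len(tau)):
            if tau[y] > 0:
                total += tau[y] * theta[x] * channel[x][y] / pred[y]
        new_theta.append(total)
    return new_theta


def kl(p, q):
    """Kullback-Leibler divergence D_KL(p || q), in nats."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


# Hypothetical example: two hidden values, two observable values.
channel = [[0.9, 0.1], [0.2, 0.8]]
theta = [0.5, 0.5]
tau = [0.3, 0.7]

# Record the divergence before each of five successive Jeffrey updates.
divergences = []
for _ in range(5):
    divergences.append(kl(tau, predict(theta, channel)))
    theta = jeffrey_update(theta, channel, tau)

for step, value in enumerate(divergences):
    print(f"step {step}: D_KL(tau || prediction) = {value:.6f}")
```

Each printed value should be no larger than the one before it, which is the monotonicity the theorem asserts. The guard `tau[y] > 0` skips observations with zero weight, so the division only requires the prediction to be positive on the support of `tau`.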
