Maximum Entropy and Bayesian Conditioning Under Extended Space

Boning Yu

TL;DR

The paper addresses how to update beliefs when new information does not correspond to an event in the original probability space, focusing on Skyrms' space-extension to a product space and its alignment with Maximum Entropy (ME). It analyzes Friedman and Shimony's theorem, which suggests that such alignment imposes a degenerate constraint on future evidence, and evaluates two interpretations of this result. The author argues that the clustering of mass around the prior mean in the extended space is a coherent feature, and that the FS constraint either does not threaten ME or signals a universal limitation of any space-extension approach. The work clarifies when ME can be viewed as an extension of Bayesian updating and highlights the trade-offs and assumptions inherent in extending probability spaces.

Abstract

This paper examines the conditions under which Bayesian conditioning aligns with Maximum Entropy. Specifically, I address cases in which newly learned information does not correspond to an event in the probability space defined on the sample space of outcomes. To facilitate Bayesian conditioning in such cases, one must therefore extend the probability space so that the new information becomes an event in this expanded space. Skyrms (1985) argues that Bayesian conditioning in an extended probability space on a product space of outcomes aligns precisely with the solution from Maximum Entropy. In contrast, Seidenfeld (1986) uses Friedman and Shimony's (1971) result to criticize Skyrms' approach as trivial, suggesting that alignment holds only under a degenerate probability model. Here, I argue that Friedman and Shimony's result must either (1) be a benign consequence of Skyrms' approach, or (2) pose a universal challenge to any method of extending spaces. Accepting (2) would imply that Bayesian conditioning is incapable of accommodating information beyond the probability space defined on the original outcome space.
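The alignment attributed to Skyrms can be illustrated numerically. The sketch below is not taken from the paper: it assumes a six-sided die (suggested by the $6^N$ sequences in Figure 1), a uniform prior over sequences of $N$ tosses, and a hypothetical observed average of 4.5. It conditions on the event that the sequence average equals that value, reads off the resulting distribution of the first toss, and compares it with the standard exponential-form Maximum Entropy solution under the same mean constraint. All function names are illustrative.

```python
import math

FACES = range(1, 7)  # assumption: outcomes of a six-sided die

def sum_counts(n):
    """counts[s] = number of length-n face sequences whose total is s."""
    counts = {0: 1}
    for _ in range(n):
        nxt = {}
        for s, c in counts.items():
            for f in FACES:
                nxt[s + f] = nxt.get(s + f, 0) + c
        counts = nxt
    return counts

def conditioned_first_toss(n, total):
    """P(first toss = k | sum of n tosses = total), uniform prior on sequences."""
    rest = sum_counts(n - 1)       # ways to complete the remaining n-1 tosses
    whole = sum_counts(n)[total]   # size of the conditioning event
    return [rest.get(total - k, 0) / whole for k in FACES]

def max_entropy_die(mean, tol=1e-12):
    """ME distribution on faces 1..6 with the given mean.

    Uses the exponential form p_k proportional to exp(lam * k) and finds
    lam by bisection, since the mean is increasing in lam.
    """
    def dist(lam):
        weights = [math.exp(lam * k) for k in FACES]
        z = sum(weights)
        return [w / z for w in weights]

    def mean_of(p):
        return sum(k * pk for k, pk in zip(FACES, p))

    lo, hi = -20.0, 20.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if mean_of(dist(mid)) < mean:
            lo = mid
        else:
            hi = mid
    return dist((lo + hi) / 2)

# Hypothetical numbers: N = 200 tosses, observed average 4.5 (total 900).
bayes = conditioned_first_toss(200, 900)
me = max_entropy_die(4.5)
for k, b, m in zip(FACES, bayes, me):
    print(k, round(b, 4), round(m, 4))
```

For finite $N$ the two columns agree only approximately; the agreement described in the abstract concerns the limiting behaviour as $N$ grows.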

Paper Structure

This paper contains 6 sections, 10 equations, 3 figures.

Figures (3)

  • Figure 1: The constitution of the extended sample space $\Omega_E$. The space consists of $6^N$ different sequences of outcomes, and each sequence has length $N$.
  • Figure 2: Measures of subsets of sequences with specified average values. Sequences in $\Omega_E$ are sorted according to their average values. The $\bar{x}$ column lists these average values; the $P$ column shows the probability measures of these averages, calculated via frequency counts; and the final column shows the limits these measures converge to as $N$ approaches infinity.
  • Figure 3: The left panel shows the distribution of elements (sequences) from the probability space $\left(\Omega_E, \mathcal{F}_E, \mu_E\right)$, categorized by their expectations. The right panel shows the corresponding distribution of elements (probability distributions) from the alternative probability space $\left(\Omega_E^{\prime}, \mathcal{F}_E^{\prime}, \mu_E^{\prime}\right)$. The distributions of elements represent the agent's prior over future evidence $\hat{d}_\epsilon$.
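The frequency-count construction described in the captions for Figures 1 and 2 can be sketched for small $N$. This is a minimal illustration rather than the paper's own computation: it assumes a six-sided die and a uniform measure on the $6^N$ sequences, and the function names and the tolerance `eps` are hypothetical. It enumerates every sequence, groups sequences by their average, and reports the measure of each group.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

def average_distribution(n):
    """Measure of each attainable average under a uniform measure on 6**n sequences."""
    # Frequency count: how many length-n sequences produce each total.
    counts = Counter(sum(seq) for seq in product(range(1, 7), repeat=n))
    total = 6 ** n
    return {Fraction(s, n): Fraction(c, total) for s, c in sorted(counts.items())}

def mass_near_prior_mean(n, eps=Fraction(1, 2)):
    """Total measure of sequences whose average lies within eps of 7/2."""
    dist = average_distribution(n)
    return sum(p for avg, p in dist.items() if abs(avg - Fraction(7, 2)) <= eps)

# Exhaustive enumeration is only feasible for small n (6**n sequences).
for n in (2, 4, 6):
    print(n, float(mass_near_prior_mean(n)))
```

Exact fractions are used so that the measures sum to exactly one. The share of mass near the prior mean 3.5 grows with $N$, which is the clustering the TL;DR refers to.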