Random Variables, Conditional Independence and Categories of Abstract Sample Spaces

Dario Stein

TL;DR

The paper unifies Simpson's probability-sheaf approach with Markov-category probability in a synthetic setting by constructing a category of abstract sample spaces S(C) from any suitable Markov category C. It defines probability spaces P(C) and sample spaces S(C), develops an intrinsic notion of conditional independence via independent pullbacks, and shows that probability sheaves (RE(V)) live naturally as atomic sheaves on S(C), recovering Simpson’s examples in a uniform, purely synthetic framework. The independence structure is validated through IP1–IP5, and relative products provide canonical independent pullbacks, enabling a robust theory of random elements and laws within a topos of atomic sheaves. The work yields concrete instantiations in FinStoch, BorelStoch, SetMulti, Gauss, and StrongName, and connects to nominal techniques through Gauss and Nom example categories, suggesting broad applicability and future research directions in probabilistic separation logic and logical foundations. The synthesis paves the way for probability spaces, sample spaces, and random variables to be treated cohesively in a high-level categorical setting with synthetic Bayesian inversion and a shared independence calculus.

Abstract

Two high-level "pictures" of probability theory have emerged: one that takes as central the notion of random variable, and one that focuses on distributions and probability channels (Markov kernels). While the channel-based picture has been successfully axiomatized, and widely generalized, using the notion of Markov category, the categorical semantics of the random variable picture remain less clear. Simpson's probability sheaves are a recent approach, in which probabilistic concepts like random variables are allowed to vary over a site of sample spaces. Simpson has identified rich structure on these sites, most notably an abstract notion of conditional independence, and given examples ranging from probability over databases to nominal sets. We aim to bring this development together with the generality and abstraction of Markov categories: we show that for any suitable Markov category, a category of sample spaces can be defined which satisfies Simpson's axioms, and that a theory of probability sheaves can be developed purely synthetically in this setting. We recover Simpson's examples in a uniform fashion from well-known Markov categories, and consider further generalizations.

Paper Structure

This paper contains 20 sections, 33 theorems, 24 equations, 1 figure.

Key Result

Proposition 1

The category $\mathbb{P}(\mathbb{C})$ is semicartesian monoidal with $(X,p) \otimes (Y,q) = (X \otimes Y, p \otimes q)$. Bayesian inversion $f \mapsto f^\dagger$ is a contravariant involutive functor on $\mathbb{P}(\mathbb{C})$, making it a dagger category (see e.g. karvonen2019way).
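
To make the statement concrete, the following is a minimal sketch of Bayesian inversion and the tensor of probability spaces in the finite discrete case (FinStoch). It is an illustration only and is not taken from the paper: it assumes distributions with full support, so that almost-sure equality of channels reduces to plain equality, and the names `push`, `dagger` and `tensor` are hypothetical helpers introduced here.

```python
from fractions import Fraction as F

def push(p, f):
    """Pushforward of the distribution p along the channel f, where f[x][y] = f(y | x)."""
    q = {}
    for x, px in p.items():
        for y, fyx in f[x].items():
            q[y] = q.get(y, 0) + px * fyx
    return q

def dagger(p, f):
    """Bayesian inverse of f with respect to the prior p.

    Returns g with g[y][x] = f(y | x) * p(x) / q(y), where q = push(p, f).
    Outcomes y with q(y) = 0 are dropped (assumption: full support).
    """
    q = push(p, f)
    return {y: {x: f[x].get(y, 0) * p[x] / q[y] for x in p}
            for y in q if q[y] != 0}

def tensor(p, q):
    """Product distribution p ⊗ q on pairs (x, y)."""
    return {(x, y): px * qy for x, px in p.items() for y, qy in q.items()}

# A probability space (X, p) and a channel f out of X (both made up for illustration).
p = {"a": F(1, 4), "b": F(3, 4)}
f = {"a": {0: F(1, 2), 1: F(1, 2)},
     "b": {0: F(1, 3), 1: F(2, 3)}}

q = push(p, f)        # f is measure-preserving as a map (X, p) -> (Y, q)
f_dag = dagger(p, f)  # candidate morphism (Y, q) -> (X, p)

assert push(q, f_dag) == p     # the inverse is measure-preserving back to p
assert dagger(q, f_dag) == f   # inverting twice returns f (involutivity)
assert sum(tensor(p, q).values()) == 1  # p ⊗ q is again a distribution

print(f_dag)
```

Exact rational arithmetic via `fractions.Fraction` is used so that the involution check holds as an equality rather than up to floating-point error.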

Figures (1)

  • Figure 1: The axioms for Markov categories

Theorems & Definitions (80)

  • Definition 1
  • Example 1: Discrete probability
  • Example 2: Borel probability
  • Example 3: Gaussian probability
  • Example 4: Nondeterminism
  • Example 5: Fresh name generation
  • Definition 2: Probability spaces
  • Definition 3: Sample spaces
  • Proposition 1: fritz2019synthetic
  • Proposition 2
  • ...and 70 more