Markov Categories and Entropy

Paolo Perrone

TL;DR

The work develops a unified framework that embeds quantitative information measures into Markov categories by enriching hom-sets with divergences and metrics. It shows how classical quantities such as $D_{KL}$, Rényi $D_\alpha$, and total-variation $d_T$ yield enrichments, enabling categorical definitions of mutual information $I_D$ and entropy $H_D$ that recover Shannon, Rényi, and Gini-Simpson (linear) forms, including extensions to nondiscrete and standard Borel settings. The paper provides equivalent characterizations via joints and marginals, derives data-processing inequalities in the enriched setting, and presents explicit formulas for mutual information and entropies in both discrete and continuous contexts, along with conditional variants. It also discusses limitations in continuous spaces and outlines future directions toward geometry-aware categorical information theory and entropy notions beyond standard measurable spaces.

Abstract

Markov categories are a novel framework to describe and treat problems in probability and information theory. In this work we combine the categorical formalism with the traditional quantitative notions of entropy, mutual information, and data processing inequalities. We show that several quantitative aspects of information theory can be captured by an enriched version of Markov categories, where the spaces of morphisms are equipped with a divergence or even a metric. As is customary in information theory, mutual information can be defined as a measure of how far a joint source is from displaying independence of its components. More strikingly, Markov categories give a notion of determinism for sources and channels, and we can define entropy exactly by measuring how far a source or channel is from being deterministic. This recovers Shannon and Rényi entropies, as well as the Gini-Simpson index used in ecology to quantify diversity, and it can be used to give a conceptual definition of generalized entropy.
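The two constructions in the abstract can be checked numerically on finite distributions. The sketch below is illustrative only and rests on an assumption not spelled out in this summary: that "distance from determinism" of a source $p$ is measured as $D(\mathrm{copy}\circ p \,\|\, p\otimes p)$, using the usual Markov-category fact that $p$ is deterministic exactly when copying its output equals taking two independent samples. All function names (`kl`, `tv`, `copied`, `independent`, `entropy`, `mutual_information`) are hypothetical helpers introduced for this example.

```python
import math
from collections import defaultdict
from itertools import product


def kl(p, q):
    """KL divergence D(p || q) in nats; p and q are dicts outcome -> probability."""
    total = 0.0
    for k, pk in p.items():
        if pk == 0.0:
            continue
        qk = q.get(k, 0.0)
        if qk == 0.0:
            return math.inf
        total += pk * math.log(pk / qk)
    return total


def tv(p, q):
    """Total variation distance between two finite distributions."""
    keys = set(p) | set(q)
    return 0.5 * sum(abs(p.get(k, 0.0) - q.get(k, 0.0)) for k in keys)


def copied(p):
    """copy . p: the joint distribution supported on the diagonal (x, x)."""
    return {(x, x): px for x, px in p.items()}


def independent(p, q):
    """p (x) q: the product distribution on pairs (x, y)."""
    return {(x, y): px * qy for (x, px), (y, qy) in product(p.items(), q.items())}


def entropy(p, divergence):
    """Assumed form: divergence between copy . p and p (x) p."""
    return divergence(copied(p), independent(p, p))


def mutual_information(joint, divergence):
    """Divergence between a joint distribution and the product of its marginals."""
    px, py = defaultdict(float), defaultdict(float)
    for (x, y), v in joint.items():
        px[x] += v
        py[y] += v
    return divergence(joint, independent(dict(px), dict(py)))


p = {"a": 0.5, "b": 0.25, "c": 0.25}
shannon = -sum(v * math.log(v) for v in p.values())
gini_simpson = 1.0 - sum(v * v for v in p.values())

print(entropy(p, kl), shannon)        # the two values agree (Shannon entropy, nats)
print(entropy(p, tv), gini_simpson)   # the two values agree (Gini-Simpson index)
print(entropy({"a": 1.0}, kl))        # a deterministic source has zero entropy
```

Under this assumption, swapping the divergence is the only change needed to move between entropy notions: KL gives Shannon entropy and total variation gives the Gini-Simpson index. Mutual information behaves as expected at the two extremes: it vanishes on `independent(p, p)` and equals `entropy(p, kl)` on the perfectly correlated joint `copied(p)`.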

Paper Structure (37 sections, 24 theorems, 202 equations)

Previous work on category theory and entropy.
Outline of this work.
Acknowledgements.
Background: Markov categories
Alphabets and channels.
Identities and sequential composition.
Parallel composition.
Copy and discard.
Divergences on Markov categories
Data processing and other inequalities
Characterization in terms of joints and marginals
Particular divergences
The KL divergence (relative entropy)
The Rényi or alpha-divergence enrichments.
The total variation distance
...and 22 more sections

Key Result

Proposition 2.6

A family of divergences $\{D_X\}$ on measurable sets (resp. finite sets) gives a divergence on ${\mathsf{Stoch}}$ (resp. ${\mathsf{FinStoch}}$) if and only if, for all probability distributions $p,p'$ on $X$ and kernels $f,f':X\to Y$, [...] and, for all probability distributions $p,p'$ on $X$ and $q,q'$ on $A$, [...].

Theorems & Definitions (63)

Definition 1.1
Definition 2.1
Remark 2.2
Definition 2.4
Definition 2.5
Proposition 2.6
proof
Theorem 2.7
proof
Corollary 2.8
...and 53 more
