Abstract Markov Random Fields
Leon Lang, Clélia de Mulatier, Rick Quax, Patrick Forré
TL;DR
The paper generalizes information-diagram techniques from Shannon entropy to a broad class of functions $F$ that satisfy the chain rule, introducing $F$-independence, $F$-mutual independence, and $F$-dual total correlation. Using the generalized Hu theorem, it constructs $F$-diagrams that visualize higher-order terms for sets of variables and proves that $F$-Markov random fields are exactly those for which the $F$-diagram regions corresponding to disconnected vertex sets of the graph vanish. The authors develop subset determination and the separoid framework to extend Yeung's results to arbitrary $F$, with applications to probabilistic models, Kullback-Leibler diagrams on Markov chains, and the ELBO decomposition for diffusion models. They give a diagrammatic representation of a weak second law of thermodynamics and a simple KL decomposition of the diffusion ELBO. Overall, the work unifies higher-order dependence concepts under $F$-diagrams and provides a foundation for analyzing graphical-model structures with general information measures.
Abstract
Markov random fields are known to be fully characterized by properties of their information diagrams, or I-diagrams. In particular, for Markov random fields, regions in the I-diagram corresponding to disconnected vertex sets in the graph vanish. Recently, I-diagrams have been generalized to F-diagrams for a larger class of functions F that satisfy the chain rule, which includes not only Shannon entropy but also Kullback-Leibler divergence and cross-entropy. In this work, we generalize the notion and characterization of Markov random fields to this larger class of functions F and investigate preliminary applications. We define F-independences, F-mutual independences, and F-Markov random fields and characterize them by their F-diagrams. In the process, we also define F-dual total correlation and prove that its vanishing is equivalent to F-mutual independence. We then specialize our results to information functions F evaluated on probability mass functions. We show that if the probability distributions of a set of random variables are Markov random fields for the same graph, then we formally recover the notion of an F-Markov random field for that graph. Finally, we study the Kullback-Leibler diagrams on specific Markov chains, leading to a visual representation of the second law of thermodynamics and a simple explicit derivation of the decomposition of the evidence lower bound for diffusion models.
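As an illustration of the classical (Shannon entropy) case that the abstract starts from, the following minimal sketch checks numerically that an I-diagram region for a disconnected vertex set vanishes. It assumes a hypothetical binary Markov chain X -> Y -> Z, whose graph is the path X - Y - Z. The vertex set {X, Z} is disconnected in that graph, and its I-diagram region is I(X;Z|Y). All probabilities and names below are illustrative and not taken from the paper.

```python
import itertools
import math

# Hypothetical binary Markov chain X -> Y -> Z (all numbers are illustrative).
p_x = [0.3, 0.7]
p_y_given_x = [[0.9, 0.1], [0.2, 0.8]]    # rows: x, columns: y
p_z_given_y = [[0.6, 0.4], [0.25, 0.75]]  # rows: y, columns: z

# Joint distribution p(x, y, z) = p(x) p(y|x) p(z|y).
joint = {
    (x, y, z): p_x[x] * p_y_given_x[x][y] * p_z_given_y[y][z]
    for x, y, z in itertools.product(range(2), repeat=3)
}


def entropy(joint, axes):
    """Shannon entropy in bits of the marginal on the given coordinate axes."""
    marginal = {}
    for outcome, prob in joint.items():
        key = tuple(outcome[a] for a in axes)
        marginal[key] = marginal.get(key, 0.0) + prob
    return -sum(p * math.log2(p) for p in marginal.values() if p > 0)


X, Y, Z = 0, 1, 2

# Region for {X, Z}: I(X;Z|Y) = H(X,Y) + H(Y,Z) - H(X,Y,Z) - H(Y).
# {X, Z} is disconnected in the path X - Y - Z, so this should vanish.
i_xz_given_y = (
    entropy(joint, (X, Y)) + entropy(joint, (Y, Z))
    - entropy(joint, (X, Y, Z)) - entropy(joint, (Y,))
)

# Region for {X, Y}: I(X;Y|Z) = H(X,Z) + H(Y,Z) - H(X,Y,Z) - H(Z).
# {X, Y} is connected, so this region is not forced to vanish.
i_xy_given_z = (
    entropy(joint, (X, Z)) + entropy(joint, (Y, Z))
    - entropy(joint, (X, Y, Z)) - entropy(joint, (Z,))
)

print(f"I(X;Z|Y) = {i_xz_given_y:.3e}")
print(f"I(X;Y|Z) = {i_xy_given_z:.3e}")
```

The first quantity is zero up to floating-point error, while the second is strictly positive for this choice of transition probabilities. The paper's contribution is to establish the analogous statement for general chain-rule functions F; this sketch covers only the entropy case.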
