How do correlations shape the landscape of information?

Ching-Peng Huang

TL;DR

The paper investigates how correlations shape the landscape of information about a stimulus $S$ when two observations $X$ and $Y$ have fixed marginals, focusing on the mutual information $I(S:X,Y)$ and its decomposition into synergistic and non-synergistic components. It adopts an algebraic statistics and information-geometry approach, introducing the correlation domain $\boldsymbol{\triangle}_P$, the shuffle distribution $Q^0$, and Segre-variety geometry to classify when the information is minimised inside the domain or on its boundary, aided by discriminants and toric-chamber considerations. A key contribution is showing how a series-expansion-based decomposition of $I$ aligns with the BROJA Partial Information Decomposition, providing a translation between these viewpoints and clarifying the roles of $I_{ci}$, $I_{cd}$, and $I_{Q^*}$. The work also uses Gaussian-mixture-model intuition to illustrate the correspondence between covariance structure and information, and outlines open questions and broader implications for interdisciplinary dialogue between mathematics and theoretical neuroscience.

Abstract

We explore a few common models of how correlations affect information. The main model considered is the Shannon mutual information $I(S:R_1,\cdots, R_i)$ over distributions with the marginals $P_{S,R_i}$ fixed for each $i$, by analogy with a setting in which $S$ is the stimulus and the $R_i$ are neurons. We work out basic models in detail, using algebro-geometric tools to write down discriminants that separate distributions with distinct qualitative behaviours in the probability simplex into toric chambers, and we evaluate the volumes of these chambers algebraically. As a byproduct, we provide a direct translation between a decomposition of mutual information inspired by a series expansion and one from partial information decomposition (PID) problems, characterising the synergistic terms of the former. We hope this paper aids communication on the topic between communities, especially mathematics and theoretical neuroscience.

KEYWORDS: information theory, algebraic statistics, mathematical neuroscience, partial information decomposition
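The fixed-marginal setting can be made concrete with a minimal sketch, assuming binary $S$, $X$, $Y$ and using NumPy. The marginal tables `p_sx` and `p_sy`, the helper names, and the parametrisation by one offset `d[s]` per stimulus are illustrative assumptions, not the paper's notation. The sketch also assumes that the shuffle distribution corresponds to the conditionally independent joint (`d = 0`), in line with the "no noise correlation" description in the Figure 3 caption.

```python
import numpy as np

def mutual_information(q):
    """I(S : X, Y) in bits for a joint probability array q[s, x, y]."""
    q = np.asarray(q, dtype=float)
    p_s = q.sum(axis=(1, 2), keepdims=True)   # P(s), shape (|S|, 1, 1)
    p_xy = q.sum(axis=0, keepdims=True)       # P(x, y), shape (1, |X|, |Y|)
    mask = q > 0                              # skip zero cells (0 log 0 = 0)
    ratio = q[mask] / (p_s * p_xy)[mask]
    return float(np.sum(q[mask] * np.log2(ratio)))

def joint_with_fixed_marginals(p_sx, p_sy, d):
    """Joint Q[s, x, y] whose (S, X) and (S, Y) marginals equal p_sx and p_sy.

    Assumes binary X and Y. d[s] moves mass along the pattern
    [[+1, -1], [-1, +1]] inside stimulus slice s, which leaves both
    marginals unchanged. d = 0 is the conditionally independent joint.
    """
    p_sx = np.asarray(p_sx, dtype=float)
    p_sy = np.asarray(p_sy, dtype=float)
    p_s = p_sx.sum(axis=1)
    if not np.allclose(p_s, p_sy.sum(axis=1)):
        raise ValueError("the two marginals disagree on P(s)")
    q0 = np.einsum("sx,sy->sxy", p_sx, p_sy) / p_s[:, None, None]
    pattern = np.array([[1.0, -1.0], [-1.0, 1.0]])
    q = q0 + np.asarray(d, dtype=float)[:, None, None] * pattern
    if q.min() < -1e-12:
        raise ValueError("d leaves the feasible region (negative probability)")
    return np.clip(q, 0.0, None)

# Hypothetical marginals, chosen only for illustration.
p_sx = np.array([[0.3, 0.2], [0.1, 0.4]])
p_sy = np.array([[0.35, 0.15], [0.2, 0.3]])

q_shuffle = joint_with_fixed_marginals(p_sx, p_sy, d=(0.0, 0.0))

# Scan the feasible rectangle of offsets for these particular marginals.
d0_grid = np.linspace(-0.06, 0.09, 31)
d1_grid = np.linspace(-0.04, 0.06, 21)
landscape = np.array([
    [mutual_information(joint_with_fixed_marginals(p_sx, p_sy, (d0, d1)))
     for d1 in d1_grid]
    for d0 in d0_grid
])

print("I at d = 0:", mutual_information(q_shuffle))
print("min / max over the grid:", landscape.min(), landscape.max())
```

The grid bounds are the largest offsets that keep every cell non-negative for these specific tables; other marginals need their own bounds. Whether the grid minimum sits in the interior or on the edge of the rectangle is the kind of qualitative distinction the abstract refers to.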

Paper Structure (29 sections, 28 theorems, 133 equations, 4 figures, 1 table)

Introduction: battles between synergy and redundancy
Landscape of mutual information
Over the domain of correlations
Parametrising and orienting the correlation domain
Binomial correlations
Bivariate binary source model
Algebraic information geometry
Linear conditions of fixed marginals
Segre variety of independence distributions
Mixture models as point configurations
Level set of binomial correlation
Discriminants on shuffle distributions
Prospects
Eyeball heuristics for Gaussian mixture models
Open questions and discussion
...and 14 more sections

Key Result

Lemma 2.3

Let $P$ be a marginal distribution as above.

Figures (4)

Figure 1: Landscape of mutual information when the marginals are independent.
Figure 2: The region of distributions that gives an interior minimum (shaded) given a fixed distribution of coordinate $P = (s,t)$. The limiting lines are labelled with their slopes. The two rulings through $P$ are in dashed lines, separating parities of signal correlation.
Figure 3: Illustration for overlapping Gaussians. Not fully to scale; redrawn by hand for a better layout, as the actual elongations should be a lot more extreme. Left: No noise correlation. This is analogous to the shuffle distribution $Q^0$. Middle: Noise correlations "align" with the signal correlation, resulting in (only intuitively) maximal overlap. This is analogous to $Q^*$ in the discrete case. Right: When noise correlations increase even more (past $Q^*$ and near $Q^{Max}$), the confidence regions separate.
Figure 4: Illustration comparing the two decompositions. Not an actual function. Hand-drawn in MS PPT.
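
The qualitative story in the Figure 3 caption (overlap first grows, then shrinks, as noise correlation increases) can be reproduced in a small numerical sketch. This is not the paper's computation: it swaps mutual information for a simpler proxy, the squared linear discriminability $d'^2 = \Delta\mu^\top \Sigma^{-1} \Delta\mu$ between two Gaussians with a shared covariance. The mean difference `delta_mu`, the unit variances, and the function name are all hypothetical choices made for illustration.

```python
import numpy as np

def dprime_squared(delta_mu, rho, sigma=1.0):
    """Squared discriminability of two Gaussians with a shared covariance.

    The covariance is sigma^2 * [[1, rho], [rho, 1]], so rho plays the
    role of the noise correlation. Smaller values mean more overlap.
    """
    cov = sigma**2 * np.array([[1.0, rho], [rho, 1.0]])
    delta_mu = np.asarray(delta_mu, dtype=float)
    return float(delta_mu @ np.linalg.solve(cov, delta_mu))

# Hypothetical mean difference with same-sign components, i.e. a
# positive signal correlation between the two observations.
delta_mu = (1.0, 2.0)

rhos = np.linspace(-0.9, 0.9, 181)
values = np.array([dprime_squared(delta_mu, r) for r in rhos])
rho_min = rhos[values.argmin()]

print("d'^2 with no noise correlation:", dprime_squared(delta_mu, 0.0))
print("noise correlation with the most overlap on the grid:", rho_min)
print("d'^2 near rho = 0.9:", dprime_squared(delta_mu, 0.9))
```

For this proxy the most overlapping configuration lies at an intermediate positive correlation, and discriminability rises again beyond it, which mirrors the left, middle, and right panels. How closely this tracks the mutual information of an actual Gaussian mixture is not addressed by the sketch.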

Theorems & Definitions (65)

Example 2.1
Example 2.2
Definition 1
Definition 2
Lemma 2.3
Proposition 2.4
Lemma 2.5
Example 2.6
Definition 3
Lemma 2.7
...and 55 more
