Inferring the dependence graph density of binary graphical models in high dimension

Julien Chevallier; Eva Löcherbach; Guilherme Ost

TL;DR

This work studies the inference of the density $p$ of a directed Erdős-Rényi interaction graph underlying a high-dimensional network of binary chains with excitatory/inhibitory mean-field coupling. It introduces simple estimators for the graph density and, under mild conditions, joint estimates of $(\mu,\lambda,p)$ with rate $N^{-1/2}+N^{1/2}/T+(\log(T)/T)^{1/2}$, justified by a detailed analysis of spatio-temporal correlations. The core methodological advances are a backward-regeneration representation based on coalescing random walks and a perfect-sampling construction conditioned on the graph, both of which also support the statistical analysis. The work provides explicit asymptotic formulas for the mean and variances (through $m,v,w$) and demonstrates the estimators’ performance via simulations, highlighting applicability to neuroscience and high-dimensional graphical modeling. Open questions include extending the results to sparse graph regimes and estimating ${\cal P}_+, {\cal P}_-$ without full parameter knowledge.
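To make the role of $p$ concrete, here is a minimal Python sketch that samples a directed Erdős-Rényi adjacency matrix and computes its empirical edge density. The function names `sample_directed_er` and `edge_density` are hypothetical, and excluding self-loops is an assumption made for illustration. This is not the estimator studied in the work: there the graph itself is not observed, only the chains are.

```python
import random


def sample_directed_er(n, p, seed=None):
    """Sample an n x n 0/1 adjacency matrix with i.i.d. Bernoulli(p) directed edges.

    Assumption for illustration: no self-loops (diagonal entries are 0).
    """
    rng = random.Random(seed)
    return [
        [1 if i != j and rng.random() < p else 0 for j in range(n)]
        for i in range(n)
    ]


def edge_density(theta):
    """Fraction of the n * (n - 1) possible directed edges that are present."""
    n = len(theta)
    return sum(map(sum, theta)) / (n * (n - 1))


if __name__ == "__main__":
    theta = sample_directed_er(200, 0.3, seed=0)
    print(edge_density(theta))  # close to 0.3 when the graph is observed directly
```

When the graph is observed, the empirical density is already a very accurate estimate of $p$; the difficulty addressed here is recovering $p$ from the dynamics alone.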

Abstract

We consider a system of binary interacting chains describing the dynamics of a group of $N$ components that, at each time unit, either send some signal to the others or remain silent. The interactions among the chains are encoded by a directed Erdős-Rényi random graph with unknown parameter $p \in (0,1)$. Moreover, the system is structured into two populations (excitatory chains versus inhibitory ones) which are coupled via a mean-field interaction on the underlying Erdős-Rényi graph. In this paper, we address the question of inferring the connectivity parameter $p$ based only on the observation of the interacting chains over $T$ time units. In our main result, we show that the connectivity parameter $p$ can be estimated with rate $N^{-1/2}+N^{1/2}/T+(\log(T)/T)^{1/2}$ through an easy-to-compute estimator. Our analysis relies on a precise study of the spatio-temporal decay of correlations of the interacting chains. This is done through the study of coalescing random walks defining a backward regeneration representation of the system. Interestingly, we also show that this backward regeneration representation allows us to perfectly sample the system of interacting chains (conditionally on each realization of the underlying Erdős-Rényi graph) from its stationary distribution. These probabilistic results are of independent interest.
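As a rough illustration of how the three terms of the stated rate trade off, the following Python sketch evaluates $N^{-1/2}+N^{1/2}/T+(\log(T)/T)^{1/2}$ for given $N$ and $T$. The function name `rate` is hypothetical, and the constant in front of the rate is ignored, so only the orders of magnitude are meaningful.

```python
import math


def rate(n, t):
    """Evaluate N^{-1/2} + N^{1/2}/T + (log(T)/T)^{1/2}, constants ignored."""
    return n ** -0.5 + math.sqrt(n) / t + math.sqrt(math.log(t) / t)


if __name__ == "__main__":
    # Compare a short and a long observation window for the same number of chains.
    for n, t in [(10_000, 100), (10_000, 10_000), (10_000, 1_000_000)]:
        print(n, t, rate(n, t))
```

The middle term $N^{1/2}/T$ dominates when $T$ is small compared with $N^{1/2}$, so in this sketch the bound only becomes small once the observation window $T$ grows well beyond $N^{1/2}$.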

Paper Structure (43 sections, 43 theorems, 479 equations, 5 figures)


Introduction
Model definition, notation and main results
Heuristics for the spatio-temporal mean
Heuristics for the spatial variance
Heuristics for the temporal variance
Backward regeneration scheme
Backward regeneration representation
Coalescence of the backward random walks
Key steps and proof of Theorem \ref{theo:1}
Simulation study
Practical implementation
Choice of the tuning parameter
Numerical results
Proof of Theorem \ref{thm:perfect_sampling}
Coalescence couplings
...and 28 more sections

Key Result

Theorem 2.1

There exists a constant $K>0$ depending only on $\lambda$ such that for all $\varepsilon\in (0,1)$, $N\geq 1$, $T\geq 2$ and $1\leq \Delta\leq \lfloor T/2\rfloor$, where the vector $(m,v,w)$ is given by

Figures (5)

Figure 1: Absolute estimation error for the six estimators and their theoretical limits. The $y$-axis is in log-scale. Each line or mark corresponds to a median computed over $N_{\rm simu}=100$ simulations. The panels correspond to the choices $\Delta = \log(T)$ and $\Delta = 1$ from left to right.
Figure 2: Absolute estimation error of $\hat{p}$ and its theoretical limit. Each line or mark corresponds to a median computed over $N_{\rm simu}=1000$ simulations. The panels correspond to different choices of the varying parameter (the non-varying parameters are set to the default values given in the Numerical results section). The values of the varying parameter are given by the color legends.
Figure 3: Graph of the relevant transitions of the backward process $P$ used to compute a bound for $\mathbb{P}_\theta ( \{ z_1 \leftrightsquigarrow z_2 \})$ when $t_1\geq t_2$. The starting node is $1 \!\dagger\! 2_\infty$ if $t_1>t_2$ and $1 \!\dagger\! 2$ otherwise. On each edge, the corresponding transition probability is given. The gray vertical line separates two temporal zones: the right one corresponds to times $t>t_2+1$ and the left one corresponds to times $t\leq t_2+1$.
Figure 4: Graph of the relevant transitions of the backward process $P$ (for times $t > t_3+1$) used to compute a bound for $\mathbb{P}_\theta ( \{ z_1 \leftrightsquigarrow z_2 \leftrightsquigarrow z_3 \})$ when $t_1\geq t_2 \geq t_3$. The starting node is $1 \!\dagger\! 2_\infty \!\dagger\! 2_\infty$ if $t_1>t_2>t_3$, $1 \!\dagger\! 2 \!\dagger\! 3_\infty$ if $t_1=t_2>t_3$ and $1 \!\dagger\! 2 \!\dagger\! 3$ (which appears in Figure 5) otherwise. On each edge, an upper bound of the corresponding transition probability is given. The gray vertical line separates two temporal zones: zone II corresponds to times $t_3+1<t\leq t_2+1$ and zone III corresponds to times $t>t_2+1$.
Figure 5: Graph of the relevant transitions of the backward process $P$ (for times $t \leq t_3+1$) used to compute a bound for $\mathbb{P}_\theta ( \{ z_1 \leftrightsquigarrow z_2 \leftrightsquigarrow z_3 \})$ when $t_1\geq t_2 \geq t_3$. The starting node is $1 \!\dagger\! 2_\infty \!\dagger\! 2_\infty$ (which appears in Figure 4) if $t_1>t_2>t_3$, $1 \!\dagger\! 2 \!\dagger\! 3_\infty$ if $t_1=t_2>t_3$ and $1 \!\dagger\! 2 \!\dagger\! 3$ otherwise. On some of the edges, an upper bound of the corresponding transition probability is given (the nodes $1\!\dagger\!23$, $13\!\dagger\!2$ and $12\!\dagger\!3$ play almost the same role, which is why we omit some of the transition probabilities). The gray vertical line separates two temporal zones: zone I corresponds to times $t\leq t_3+1$ and zone II corresponds to times $t_3+1<t\leq t_2+1$.

Theorems & Definitions (98)

Theorem 2.1
Proposition 2.2
Remark 2.3
Corollary 2.4
proof
Proposition 3.1
proof
Remark 3.2
Theorem 3.3
Remark 3.4
...and 88 more
