Sharp threshold for network recovery from voter model dynamics
Hang Du, Seokmin Ha, Oriol Solé-Pi
TL;DR
The paper establishes a sharp, computable threshold for exactly recovering a latent directed Erdős-Rényi graph from multiple voter-model trajectories observed up to time $T$. It combines a mean-field duality to coalescing random walks with a cluster-based decoding strategy, proving that if $M\cdot\min\{T,n\} \ge C n^2 p^2 \log n$, an efficient estimator recovers $G^*$ with probability at least $0.9$, and providing a matching information-theoretic lower bound. This shows there is no statistical-computational gap in this setting and that the information carried by voter dynamics scales with the effective number of update rounds. The results are complemented by experiments on synthetic and real networks, including a Twitter interaction graph, demonstrating practical effectiveness and robustness beyond the theoretical regime. Overall, the work advances learning network structure from dynamical processes and clarifies when efficient exact recovery is possible in mean-field-like graphs.
Abstract
We investigate the problem of recovering a latent directed Erdős-Rényi graph $G^*\sim \mathcal G(n,p)$ from observations of discrete voter model trajectories on $G^*$, where $np$ grows polynomially in $n$. Given access to $M$ independent voter model trajectories evolving up to time $T$, we establish that $G^*$ can be recovered \emph{exactly} with probability at least $0.9$ by an \emph{efficient} algorithm, provided that \[ M \cdot \min\{T, n\} \geq C n^2 p^2 \log n \] holds for a sufficiently large constant $C$. Here, $M\cdot \min\{T,n\}$ can be interpreted as the approximate number of effective update rounds being observed, since the voter model on $G^*$ typically reaches consensus after $\Theta(n)$ rounds, and no further information can be gained after this point. Furthermore, we prove an \emph{information-theoretic} lower bound showing that the above condition is tight up to a constant factor. Our results indicate that the recovery problem does not exhibit a statistical-computational gap.
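To make the quantities in the recovery condition concrete, the following is a minimal Python sketch and not the paper's estimator. It assumes a synchronous binary voter model in which every vertex copies the opinion of a uniformly random out-neighbor at each round; the abstract does not specify the update rule, so this is an illustrative choice. The function names and the default `C = 1.0` are hypothetical placeholders, since the abstract only states that $C$ is a sufficiently large constant.

```python
import math
import numpy as np


def effective_rounds(M, T, n):
    """Left-hand side of the condition: M * min(T, n)."""
    return M * min(T, n)


def recovery_threshold(n, p, C=1.0):
    """Right-hand side C * n^2 * p^2 * log(n); C = 1.0 is a placeholder."""
    return C * n**2 * p**2 * math.log(n)


def sample_directed_er(n, p, rng):
    """Adjacency matrix of a directed G(n, p) graph without self-loops."""
    A = rng.random((n, n)) < p
    np.fill_diagonal(A, False)
    return A


def simulate_voter(A, T, rng):
    """One trajectory of an assumed synchronous binary voter model.

    At every round each vertex copies the current opinion of a uniformly
    random out-neighbor; a vertex with no out-neighbors keeps its opinion.
    Returns an array of shape (T + 1, n), including the initial state.
    """
    n = A.shape[0]
    opinions = rng.integers(0, 2, size=n)
    trajectory = [opinions.copy()]
    for _ in range(T):
        updated = opinions.copy()
        for v in range(n):
            neighbors = np.flatnonzero(A[v])
            if neighbors.size > 0:
                updated[v] = opinions[rng.choice(neighbors)]
        opinions = updated
        trajectory.append(opinions.copy())
    return np.array(trajectory)
```

Comparing `effective_rounds(M, T, n)` with `recovery_threshold(n, p)` gives a rough sense of how many trajectories the condition asks for at a given graph size and density. Because of the $\min\{T, n\}$ term, increasing `T` beyond `n` does not increase the left-hand side.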
