Compressing the chronology of a temporal network with graph commutators

Andrea J. Allen, Cristopher Moore, Laurent Hébert-Dufresne

TL;DR

The paper addresses how to compress the chronology of a temporal network without distorting the dynamics running on it, focusing on epidemic-like spreading processes. It introduces a dynamics-preserving compression scheme that greedily merges adjacent snapshots according to a commutator-derived error measure, grounded in the linearized SI model and the Baker-Campbell-Hausdorff expansion; the key metric $\xi_{A,B}$ captures both chronological sensitivity and structural difference. Empirically, the method achieves substantial compression while preserving spreading dynamics on synthetic and real contact networks, outperforming even-width partitioning and minimum description length (MDL) approaches. This provides a practical tool for reducing temporal-network data and computation while maintaining the fidelity of the dynamics, with potential applications to synchronization, cascading failures, and other time-varying processes.
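
As a sketch of the standard argument behind a commutator-based measure, consider two snapshots $A$ then $B$, each active for a duration $\delta t$. The notation and the choice of aggregate below (the average $(A+B)/2$ active for $2\delta t$) are assumptions made for illustration, not the paper's exact definitions. The linearized SI dynamics $\dot{x} = \beta A x$ then give

$$
x_{\text{temporal}} = e^{\beta \delta t B}\, e^{\beta \delta t A}\, x(0), \qquad x_{\text{aggregate}} = e^{\beta \delta t (A+B)}\, x(0),
$$

and the Baker-Campbell-Hausdorff expansion of the temporal propagator reads

$$
e^{\beta \delta t B}\, e^{\beta \delta t A} = \exp\!\left( \beta \delta t\, (A+B) + \tfrac{1}{2} \beta^2 \delta t^2\, [B, A] + O(\beta^3 \delta t^3) \right).
$$

To leading order, the two propagators therefore differ only through the commutator $[B,A] = BA - AB$, which vanishes when the snapshots commute.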

Abstract

Studies of dynamics on temporal networks often represent the network as a series of "snapshots," static networks active for short durations of time. We argue that successive snapshots can be aggregated if doing so has little effect on the overlying dynamics. We propose a method to compress network chronologies by progressively combining pairs of snapshots whose matrix commutators have the smallest dynamical effect. We apply this method to epidemic modeling on real contact tracing data and find that it allows for significant compression while remaining faithful to the epidemic dynamics.
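
The progressive pairwise merging described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the score `merge_score` is a hypothetical stand-in (the product of the two durations and the Frobenius norm of the commutator, motivated by the leading-order Baker-Campbell-Hausdorff term) rather than the paper's $\xi$, and merged snapshots are assumed to be duration-weighted averages of their adjacency matrices.

```python
def matmul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def merge_score(s1, s2):
    """Stand-in error score (an assumption, not the paper's xi):
    dt_a * dt_b * Frobenius norm of the commutator [A, B] = AB - BA."""
    (a, dt_a), (b, dt_b) = s1, s2
    ab, ba = matmul(a, b), matmul(b, a)
    n = len(a)
    frob = sum((ab[i][j] - ba[i][j]) ** 2
               for i in range(n) for j in range(n)) ** 0.5
    return dt_a * dt_b * frob


def merge(s1, s2):
    """Combine two (adjacency, duration) snapshots into one snapshot whose
    adjacency is the duration-weighted average (assumed aggregation rule)."""
    (a, dt_a), (b, dt_b) = s1, s2
    total = dt_a + dt_b
    n = len(a)
    avg = [[(dt_a * a[i][j] + dt_b * b[i][j]) / total for j in range(n)]
           for i in range(n)]
    return avg, total


def compress(snapshots, target):
    """Greedily merge the adjacent pair with the lowest score until only
    `target` snapshots remain (never fewer than one)."""
    snaps = list(snapshots)
    while len(snaps) > max(target, 1):
        scores = [merge_score(snaps[i], snaps[i + 1])
                  for i in range(len(snaps) - 1)]
        i = scores.index(min(scores))
        snaps[i:i + 2] = [merge(snaps[i], snaps[i + 1])]
    return snaps
```

Calling `compress(snapshots, target=6)` on a list of `(adjacency, duration)` pairs returns a shorter list of the same form. Only adjacent pairs are ever merged, so the chronological order of the remaining snapshots is preserved.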

Paper Structure (8 sections, 14 equations, 4 figures)


Figures (4)

  • Figure 1: Schema of our hierarchical aggregation. Given network snapshots, we compare the aggregate spreading dynamics of each adjacent pair of snapshots and combine the pair with the lowest induced error, continuing until we reach a desired number of snapshots.
  • Figure 2: Top left: Degree distributions for two snapshots. Top right: Ordinary differential equation (ODE) solutions of the SI dynamics with $\beta = 0.12$, $\delta t = 5$ on the temporal and aggregate versions of the snapshots, with the error terms highlighted. Bottom left: Difference between the ODE solutions for the number of infected nodes under the temporal and aggregate regimes, for values of $\beta t$ obtained by varying $t \in [0,5]$. Bottom right: Ranking of $\xi_{1,2}$ for snapshots 1 and 2 for increasing values of $\beta\,\delta t$, compared against the integrated area between the solutions.
  • Figure 3: Compression of a series of 50 synthetic network snapshots (detailed in the appendix [supplementary_material]), showing the SI dynamics with $\beta=0.0017$. Top: We use our algorithm to compress the network history to $6$ snapshots of varying lengths, with boundaries shown by the orange dashed lines. We compare with the SI dynamics using even-width aggregation into $6$ windows of fixed length (blue dashed lines). The snapshots produced by our algorithm give SI dynamics closer to those on the full temporal network (gray). Middle: Normalized distance from the temporal curve over time for each solution, $d_{\textrm{ALG}}$ for the algorithmic solution and $d_{\textrm{EVEN}}$ for the even-width solution. Bottom: The vertical axis shows the shaded area from the middle panel as a function of the number of aggregated snapshots, normalized by the error induced by aggregating the entire history into a single static network. It therefore measures the error induced by aggregation as a fraction of the worst-case scenario. Generally, our algorithm results in almost ten times less error than an even temporal split.
  • Figure 4: Application to empirical temporal network data. Top: The error measure $\xi_{S(t), S(t+1)}$ computed for consecutive snapshot pairs at 3 different levels of pre-aggregation on a hospital contact network [vanhems_estimating_2013]. The hospital contact data contains contacts for approximately 9,000 unique timestamps. We pre-aggregate by evenly coarse-graining the data to 4,000, 1,000, and 200 snapshots. Pre-aggregation of the data into small static snapshots does not affect the overall quality of our compression [supplementary_material]. Middle: The error of the SI process solutions, relative to the error induced by aggregating the entire history into a single static network (full error), as a function of the resulting number of aggregated snapshots. The red star shows the error induced by the MDL optimal compression, which consists of 23 snapshots for this data. The vertical axis of the inset shows the ratio of compression achieved at a given error level by our algorithm versus even-width aggregation (orange markers) and versus MDL (red star). For example, our algorithm can compress the data to 18 snapshots while maintaining lower error than 30 even snapshots, leading to a 30/18 ($\sim 1.7$) compression ratio. Bottom: Summary of the inset of the middle panel across other datasets [Genois2018, fournet2014, Isella:2011qo], showing the distribution of compression ratios. Our algorithm can be expected to further compress the number of snapshots by 50 to 100%. Importantly, it always outperforms MDL compression, although the two approaches can reach very similar outputs, for example on the conference dataset.
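
The temporal-versus-aggregate comparison shown in Figure 2 can be mimicked in a simplified, linearized setting. The sketch below is an illustration under stated assumptions, not the figure's actual computation: it uses the linearized SI propagator $e^{\beta \delta t A}$ instead of the full ODE solution, a truncated Taylor series for the matrix exponential, and a hypothetical helper `aggregation_gap` that reports the difference in total (linearized) infection between playing snapshot $A$ then $B$ and playing their average for the combined duration.

```python
def matmul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def matvec(m, v):
    """Matrix-vector product."""
    return [sum(x * y for x, y in zip(row, v)) for row in m]


def scale(m, c):
    """Multiply every entry of a matrix by the scalar c."""
    return [[c * x for x in row] for row in m]


def add(a, b):
    """Entrywise sum of two matrices."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def expm(m, terms=30):
    """Matrix exponential via a truncated Taylor series; adequate only for
    small matrices with modest norm (an illustrative shortcut)."""
    n = len(m)
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        # term now holds m^k / k!
        term = scale(matmul(term, m), 1.0 / k)
        result = add(result, term)
    return result


def aggregation_gap(a, b, beta, dt, x0):
    """Total linearized infection with A then B (each for dt), minus the
    total with the average (A + B) / 2 applied for 2 * dt."""
    after_a = matvec(expm(scale(a, beta * dt)), x0)
    temporal = matvec(expm(scale(b, beta * dt)), after_a)
    # exp(2 * dt * beta * (A + B) / 2) = exp(beta * dt * (A + B))
    aggregate = matvec(expm(scale(add(a, b), beta * dt)), x0)
    return sum(temporal) - sum(aggregate)
```

With a seed on node 0, an edge 0-1 in the first snapshot, and an edge 1-2 in the second, the temporal ordering lets the infection travel along the path and the gap is positive. Swapping the two snapshots makes it negative, and two identical snapshots give a gap of zero up to floating-point error.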