Lifelong Graph Learning for Graph Summarization

Jonatan Frank, Marcel Hoffmann, Nicolas Lell, David Richerby, Ansgar Scherp

TL;DR

This work extends graph summarization from static to temporal graphs by employing lifelong learning with neural models (Graph-MLP, GraphSAINT-based sampling, and an MLP baseline) to produce vertex-class summaries across ten weekly DyLDO snapshots from 2012 and 2022. It analyzes $1$-hop and $2$-hop summary models and measures forward transfer, backward transfer, and forgetting as the graphs evolve. The networks rely predominantly on $1$-hop information even when predicting $2$-hop summaries, while $2$-hop summaries generate substantially more equivalence classes (EQCs) and change more between snapshots. A ten-year time warp shows that reusing a model trained a decade earlier offers no clear advantage over training anew, which underscores how hard knowledge transfer is in highly heterogeneous temporal graphs. The results indicate that practical lifelong graph learning for summarization must account for EQCs that emerge and disappear over time, and that transfer metrics beyond accuracy are essential for judging whether a model suits an evolving network.
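To make the notion of $1$-hop versus $2$-hop equivalence classes concrete, here is a minimal Python sketch on a toy graph. It assumes a simple summary model in which a vertex's class is determined by the set of its outgoing edge labels, and the $2$-hop variant additionally looks at the outgoing labels of its neighbors. This model, the function names, and the toy triples are illustrative assumptions, not the definitions used in the paper.

```python
from collections import defaultdict

# Toy edge list of (subject, label, object) triples; purely illustrative.
edges = [
    ("a", "knows", "c"),
    ("b", "knows", "d"),
    ("c", "name", "lit1"),
    ("d", "age", "lit2"),
]

def one_hop_signature(vertex, edges):
    """Assumed 1-hop summary: the set of outgoing edge labels of the vertex."""
    return frozenset(label for (s, label, _o) in edges if s == vertex)

def two_hop_signature(vertex, edges):
    """Assumed 2-hop summary: own labels plus each neighbor's 1-hop signature."""
    own = one_hop_signature(vertex, edges)
    neighbors = frozenset(
        (label, one_hop_signature(o, edges))
        for (s, label, o) in edges
        if s == vertex
    )
    return (own, neighbors)

def equivalence_classes(edges, signature):
    """Group all vertices by their signature; each group is one EQC."""
    vertices = {s for (s, _l, _o) in edges} | {o for (_s, _l, o) in edges}
    classes = defaultdict(set)
    for v in vertices:
        classes[signature(v, edges)].add(v)
    return dict(classes)

one_hop = equivalence_classes(edges, one_hop_signature)
two_hop = equivalence_classes(edges, two_hop_signature)
print(len(one_hop), len(two_hop))
```

In this toy graph, `a` and `b` share a $1$-hop class because both have only a `knows` edge, but they fall into different $2$-hop classes because their neighbors carry different labels. This is the mechanism by which a $2$-hop model can yield many more EQCs than a $1$-hop model on the same graph.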

Abstract

Summarizing web graphs is challenging due to the heterogeneity of the modeled information and its changes over time. We investigate the use of neural networks for lifelong graph summarization. Assuming we observe the web graph at a certain time, we train the networks to summarize graph vertices. We apply this trained network to summarize the vertices of the changed graph at the next point in time. Subsequently, we continue training and evaluating the network to perform lifelong graph summarization. We use the GNNs Graph-MLP and GraphSAINT, as well as an MLP baseline, to summarize the temporal graphs. We compare $1$-hop and $2$-hop summaries. We investigate the impact of reusing parameters from a previous snapshot by measuring the backward and forward transfer and the forgetting rate of the neural networks. Our extensive experiments on ten weekly snapshots of a web graph with over $100$M edges, sampled in 2012 and 2022, show that all networks predominantly use $1$-hop information to determine the summary, even when performing $2$-hop summarization. Due to the heterogeneity of web graphs, in some snapshots, the $2$-hop summary produces over ten times more vertex summaries than the $1$-hop summary. When using the network trained on the last snapshot from 2012 and applying it to the first snapshot of 2022, we observe a strong drop in accuracy. We attribute this drop over the ten-year time warp to the strongly increased heterogeneity of the web graph in 2022.
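The transfer measures mentioned above can be sketched from an accuracy matrix. The Python example below uses definitions that are common in the continual-learning literature; the paper's exact formulas are not shown in this excerpt and may differ. The convention that `acc[i][j]` is the accuracy on snapshot `j` after training up to snapshot `i`, the `baseline` vector (accuracy of an untrained network on each snapshot), and all numbers are hypothetical.

```python
# acc[i][j]: accuracy on snapshot j after sequential training on snapshots 0..i.
# All values are made up for illustration.
acc = [
    [0.80, 0.50, 0.40],
    [0.70, 0.85, 0.55],
    [0.65, 0.80, 0.90],
]
# Assumed accuracy of a randomly initialized network on each snapshot.
baseline = [0.10, 0.10, 0.10]

def backward_transfer(acc):
    """Mean change on earlier snapshots after training on the final one."""
    T = len(acc)
    return sum(acc[T - 1][j] - acc[j][j] for j in range(T - 1)) / (T - 1)

def forward_transfer(acc, baseline):
    """Mean gain on the next, not yet trained, snapshot over the baseline."""
    T = len(acc)
    return sum(acc[j - 1][j] - baseline[j] for j in range(1, T)) / (T - 1)

def forgetting(acc):
    """Mean drop from the best earlier accuracy to the final accuracy."""
    T = len(acc)
    total = 0.0
    for j in range(T - 1):
        best_earlier = max(acc[i][j] for i in range(T - 1))
        total += best_earlier - acc[T - 1][j]
    return total / (T - 1)

bwt = backward_transfer(acc)
fwt = forward_transfer(acc, baseline)
fgt = forgetting(acc)
print(round(bwt, 3), round(fwt, 3), round(fgt, 3))
```

A negative backward transfer means that continued training hurt performance on earlier snapshots, while a positive forward transfer means that parameters reused from the previous snapshot helped on the next one before any training on it.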

Paper Structure (38 sections, 7 equations, 14 figures, 5 tables)

Figures (14)

  • Figure 1: $3$-hop neighborhood of a vertex $v_1$.
  • Figure 2: Detailed analyses of the DyLDO snapshots used in our experiments. Figures (a) and (b) show the number of unique EQCs per snapshot for a $1$-hop versus $2$-hop model. Figures (c)--(f) show the changes of the EQCs (addition, deletion, and recurring) compared to the previous snapshot.
  • Figure 3: Accuracies for snapshots trained from May to July 2012. The accuracies for Graph-MLP ($2$-hop), GCN ($2$-hop), and GCN ($2$-hop and edges) are omitted because they are similar to the values of the MLP ($2$-hop).
  • Figure 4: Accuracies for snapshots trained from September to December 2022. Graph-MLP ($2$-hop), GCN ($2$-hop), and GCN ($2$-hop and edges) are omitted because they are similar to the values of the MLP ($2$-hop).
  • Figure 5: The progress of the forgetting in 2012.
  • ...and 9 more figures