A Diffusive Data Augmentation Framework for Reconstruction of Complex Network Evolutionary History
En Xu, Can Rong, Jingtao Ding, Yong Li
TL;DR
This work reconstructs the evolutionary history of complex networks from static topologies by reframing edge-generation-time prediction as pairwise edge-order prediction and training across multiple temporal networks. It introduces a Comparative Paradigm-based Neural Network (CPNN) and shows that cross-network training markedly improves transferability, while a diffusion-model-based augmentation method (TopoEvoDiff) generates diverse temporal networks that further boost performance. On unseen static networks, the combined approach yields an average accuracy improvement of about 22.44% in total (16.98% from cross-network training plus 5.46% from augmentation), and the augmented networks show high structural fidelity (lower NRMSE than baselines). The methods support reconstruction of network evolution across domains, with practical implications for analyzing biological, ecological, and socio-economic systems where temporal data are scarce.
Abstract
The evolutionary processes of complex systems contain critical information about their functional characteristics. The generation times of edges provide insight into the historical evolution of networked complex systems such as protein-protein interaction networks, ecosystems, and social networks. Recovering these evolutionary processes has significant scientific value; for example, it aids in interpreting the evolution of protein-protein interaction networks. Existing methods can predict the generation times of the remaining edges given a partial temporal network, but they often perform poorly in cross-network prediction and frequently fail to recover edge generation times for static networks that lack timestamps. In this work, we adopt a comparative paradigm-based framework that fuses multiple networks for training, enabling cross-network learning of the relationship between network structure and edge generation times. Compared to training on each network separately, this approach yields an average accuracy improvement of 16.98%. Furthermore, because temporal networks are difficult to collect, we propose a novel diffusion-model-based generation method that produces a large number of temporal networks. Jointly training on real and generated temporal networks yields an additional average accuracy improvement of 5.46%.
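To make the comparative paradigm concrete, the following is a minimal illustrative sketch, not the authors' implementation. It assumes each temporal network is given as a mapping from an edge to its generation time, turns every pair of edges into a binary order label, and pools the labeled pairs from several networks into one training set. The function names (pairwise_order_samples, pool_networks, pairwise_accuracy), the toy networks, and the choice to skip tied timestamps are all assumptions made for illustration; reading the reported accuracy as the fraction of correctly ordered pairs is likewise an assumption.

```python
from itertools import combinations


def pairwise_order_samples(edge_times):
    """Turn one temporal network into pairwise edge-order samples.

    edge_times: dict mapping an edge (u, v) to its generation time.
    Returns a list of ((edge_a, edge_b), label), where label is 1 if
    edge_a was generated before edge_b and 0 otherwise.
    Pairs with identical timestamps are skipped (an assumption).
    """
    samples = []
    for (edge_a, t_a), (edge_b, t_b) in combinations(sorted(edge_times.items()), 2):
        if t_a == t_b:
            continue
        samples.append(((edge_a, edge_b), 1 if t_a < t_b else 0))
    return samples


def pool_networks(networks):
    """Fuse the pairwise samples of several networks into one training set.

    networks: dict mapping a network name to its edge_times dict.
    Returns a list of (network_name, (edge_a, edge_b), label).
    """
    pooled = []
    for name, edge_times in networks.items():
        for pair, label in pairwise_order_samples(edge_times):
            pooled.append((name, pair, label))
    return pooled


def pairwise_accuracy(samples, predict_earlier):
    """Fraction of pairs whose order is predicted correctly.

    predict_earlier: callable (network_name, edge_a, edge_b) -> 0 or 1.
    """
    if not samples:
        return 0.0
    correct = sum(
        1 for name, (edge_a, edge_b), label in samples
        if predict_earlier(name, edge_a, edge_b) == label
    )
    return correct / len(samples)


# Two hypothetical toy networks whose timestamps use different scales.
toy_networks = {
    "net_steps": {(0, 1): 1, (1, 2): 2, (0, 2): 3},
    "net_years": {(0, 1): 2001.0, (1, 2): 2001.0, (2, 3): 1999.5},
}

pooled_samples = pool_networks(toy_networks)

# Placeholder predictor that always says "the first edge is older";
# a trained model such as CPNN would take its place.
baseline_accuracy = pairwise_accuracy(pooled_samples, lambda name, a, b: 1)
```

In this sketch the labels depend only on the relative order of timestamps, so networks whose times are recorded in different units can be pooled without rescaling. Generated temporal networks could be added to the same dictionary alongside real ones before pooling.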
