Efficient and Robust Continual Graph Learning for Graph Classification in Biology
Ding Zhang, Jane Downer, Can Chen, Ren Wang
TL;DR
The paper tackles graph-level continual learning for biology, where models must learn from a sequence of evolving tasks without forgetting earlier knowledge. It introduces PSCGL, a framework that combines memory replay with perturbed graph sampling, motif-based sparsification, and consistency training to improve efficiency, robustness, and scalability. PSCGL sustains knowledge across tasks and also defends against graph backdoor attacks. On the Enzymes and Aromaticity datasets it achieves superior average performance and lower forgetting, while sparsification reduces storage and computation. The approach has practical implications for reliable biological graph analysis in dynamic settings, including drug discovery and enzyme function prediction, where data evolve and security concerns are paramount.
Abstract
Graph classification is essential for understanding complex biological systems, where molecular structures and interactions are naturally represented as graphs. Traditional graph neural networks (GNNs) perform well on static tasks but struggle in dynamic settings due to catastrophic forgetting. We present Perturbed and Sparsified Continual Graph Learning (PSCGL), a robust and efficient continual graph learning framework for graph classification that specifically targets biological datasets. We introduce a perturbed sampling strategy that identifies the data points most critical to model learning, and a motif-based graph sparsification technique that reduces storage needs while maintaining performance. Additionally, the PSCGL framework inherently defends against graph backdoor attacks, which is crucial for applications in sensitive biological contexts. Extensive experiments on biological datasets demonstrate that PSCGL not only retains knowledge across tasks but also enhances the efficiency and robustness of graph classification models in biology.
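Knowledge retention across tasks is usually quantified with two numbers: average performance after the final task, and average forgetting. The sketch below shows one common convention for computing both from an accuracy matrix. It is an illustrative assumption, not necessarily the exact definition used for PSCGL. The function names and the example accuracies are hypothetical. Here acc[t][i] is the accuracy on task i measured after training on task t.

```python
import numpy as np


def average_performance(acc):
    """Mean accuracy over all tasks after the last task has been learned."""
    acc = np.asarray(acc, dtype=float)
    return float(acc[-1].mean())


def average_forgetting(acc):
    """Mean drop in accuracy on earlier tasks (assumed convention).

    For each task i except the last, compare the accuracy right after
    learning it (acc[i, i]) with the accuracy after the final task
    (acc[-1, i]). Positive values mean the model forgot.
    """
    acc = np.asarray(acc, dtype=float)
    num_tasks = acc.shape[0]
    if num_tasks < 2:
        return 0.0  # nothing can be forgotten with a single task
    just_learned = np.diag(acc)[:-1]
    final = acc[-1, :-1]
    return float((just_learned - final).mean())


# Hypothetical accuracies for three sequential tasks (not from the paper).
# Entries above the diagonal are unused because those tasks are not yet seen.
acc = [
    [0.90, 0.00, 0.00],
    [0.80, 0.85, 0.00],
    [0.70, 0.75, 0.80],
]

ap = average_performance(acc)
af = average_forgetting(acc)
print(f"average performance: {ap:.3f}, average forgetting: {af:.3f}")
```

Under this convention a replay-based method is doing well when average performance is high and average forgetting is close to zero. Some benchmarks report forgetting with the opposite sign, so the convention should be checked before comparing numbers.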
