Core Knowledge Learning Framework for Graph Adaptation and Scalability Learning
Bowen Zhang, Zhichao Huang, Genan Dai, Guangning Xu, Xiaomao Fan, Hu Huang
TL;DR
The paper introduces Core Knowledge Learning (CKL), a framework that learns a task-relevant core subgraph to address domain shift and data scarcity in graph classification. By extracting G_sub via node/edge selection and using it for graph domain adaptation and few-shot learning, CKL achieves improved robustness and scalability compared to state-of-the-art methods. The approach integrates mutual-information-based explainability, WL-subtree kernel-based domain transfer, and bi-level optimization for few-shot tasks, demonstrating strong empirical gains across diverse graph datasets and molecular tasks. CKL further shows flexibility with different GNN backbones and kernels, highlighting its potential as a unified solution for cross-domain graph learning problems.
Abstract
Graph classification is a pivotal challenge in machine learning on graph-structured data, given its importance in real-world applications such as social network analysis, recommendation systems, and bioinformatics. Despite its significance, graph classification faces several hurdles, including adapting to diverse prediction tasks, training across multiple target domains, and handling small-sample prediction scenarios. Current methods often tackle these challenges individually, leading to fragmented solutions that lack a holistic approach to the overarching problem. In this paper, we propose an algorithm that addresses these challenges jointly. By incorporating insights from various types of tasks, our method aims to enhance adaptability, scalability, and generalizability in graph classification. Motivated by the recognition that an underlying subgraph plays a crucial role in GNN prediction, while the remainder of the graph is task-irrelevant, we introduce the Core Knowledge Learning (CKL) framework for graph adaptation and scalability learning. CKL comprises several key modules, including the core subgraph knowledge submodule, the graph domain adaptation module, and the few-shot learning module for downstream tasks. Each module is tailored to tackle specific challenges in graph classification, such as domain shift, label inconsistencies, and data scarcity. By learning the core subgraph of the entire graph, we focus on the features most relevant to the task. Consequently, our method offers benefits such as improved model performance, increased domain adaptability, and enhanced robustness to domain variations. Experimental results demonstrate significant performance improvements over state-of-the-art approaches.
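To make the idea of a "core subgraph" concrete, the following is a minimal illustrative sketch, not the authors' implementation. It assumes that each edge already carries an importance score (in CKL such scores would come from the learned core subgraph knowledge submodule; here they are hard-coded) and simply keeps the top-scoring fraction of edges together with the nodes they touch. The function name `select_core_subgraph` and the parameter `keep_ratio` are hypothetical.

```python
import math
from typing import Dict, List, Set, Tuple

Edge = Tuple[int, int]


def select_core_subgraph(
    edge_scores: Dict[Edge, float], keep_ratio: float = 0.5
) -> Tuple[Set[int], List[Edge]]:
    """Keep the highest-scoring edges and the nodes they touch.

    edge_scores: hypothetical per-edge importance scores (assumed given).
    keep_ratio: fraction of edges to retain, in (0, 1].
    """
    if not 0.0 < keep_ratio <= 1.0:
        raise ValueError("keep_ratio must be in (0, 1]")
    if not edge_scores:
        return set(), []
    # Rank edges by score, highest first; ties are broken by edge id
    # so the result is deterministic.
    ranked = sorted(edge_scores, key=lambda e: (-edge_scores[e], e))
    k = max(1, math.ceil(keep_ratio * len(ranked)))
    core_edges = ranked[:k]
    # The core node set is induced by the retained edges.
    core_nodes = {node for edge in core_edges for node in edge}
    return core_nodes, core_edges


# Toy path graph 0-1-2-3-4 with made-up scores.
scores = {(0, 1): 0.9, (1, 2): 0.8, (2, 3): 0.1, (3, 4): 0.2}
core_nodes, core_edges = select_core_subgraph(scores, keep_ratio=0.5)
```

In this toy case the two highest-scoring edges are retained, so the low-scoring tail of the path is discarded as task-irrelevant. A downstream classifier, domain adaptation step, or few-shot learner would then operate on the retained subgraph rather than on the full graph.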
