AugWard: Augmentation-Aware Representation Learning for Accurate Graph Classification
Minjun Kim, Jaehyeon Choi, SeungJoo Lee, Jinhong Jung, U Kang
TL;DR
The paper tackles overfitting in graph classification, which existing methods address only with a naive use of graph augmentation. It proposes AugWard, which combines augmentation-aware training, a graph distance based on FGWD (fused Gromov-Wasserstein distance), and consistency regularization to exploit the differences that augmentation introduces. Experiments show state-of-the-art accuracy in supervised, semi-supervised, and transfer learning across multiple datasets, with FGWD computation adding a manageable overhead (~4.9% of training time). The results indicate that explicitly modeling augmentation-induced differences and enforcing prediction consistency improves representation quality and transferability.
Abstract
How can we accurately classify graphs? Graph classification is a pivotal task in data mining, with applications such as social network analysis, web analysis, drug discovery, and molecular property prediction. Graph neural networks have achieved state-of-the-art performance in graph classification, but they consistently struggle with overfitting. To mitigate overfitting, researchers have introduced various representation learning methods that utilize graph augmentation. However, existing methods use graph augmentation simplistically, which discards augmentation-induced differences and limits the expressiveness of the learned representations. In this paper, we propose AugWard (Augmentation-Aware Training with Graph Distance and Consistency Regularization), a novel graph representation learning framework that carefully considers the diversity introduced by graph augmentation. AugWard applies augmentation-aware training to predict the graph distance between an augmented graph and its original, aligning the representation difference directly with the graph distance at both the feature and structure levels. Furthermore, AugWard employs consistency regularization to encourage the classifier to handle richer representations. Experimental results show that AugWard achieves state-of-the-art performance in supervised and semi-supervised graph classification and in transfer learning.
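The two training signals described in the abstract can be illustrated with a minimal sketch. The abstract does not give the loss formulas, so everything here is an assumption made for illustration: the squared-error form of the augmentation-aware term, the KL-divergence form of the consistency term, the weights `lam_aware` and `lam_cons`, and all function names are hypothetical. The graph distance is assumed to be precomputed (for example, by an FGWD solver) and passed in as a number.

```python
import math

def l2_distance(u, v):
    # Euclidean distance between two representation vectors.
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def softmax(logits):
    # Numerically stable softmax over a list of logits.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def augmentation_aware_loss(h_orig, h_aug, graph_distance):
    # Assumed form: penalize the gap between the representation
    # difference and the (precomputed) graph distance.
    return (l2_distance(h_orig, h_aug) - graph_distance) ** 2

def consistency_loss(logits_orig, logits_aug):
    # Assumed form: KL divergence between the class distributions
    # predicted for the original and the augmented graph.
    p = softmax(logits_orig)
    q = softmax(logits_aug)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def total_loss(ce_loss, h_orig, h_aug, graph_distance,
               logits_orig, logits_aug, lam_aware=1.0, lam_cons=1.0):
    # Hypothetical weighted sum of the classification loss and the
    # two auxiliary terms; the weights are illustrative only.
    aware = augmentation_aware_loss(h_orig, h_aug, graph_distance)
    cons = consistency_loss(logits_orig, logits_aug)
    return ce_loss + lam_aware * aware + lam_cons * cons
```

In this sketch, the augmentation-aware term is zero when the distance between the two representations equals the graph distance, and the consistency term is zero when the classifier predicts the same class distribution for both graphs.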
