Meta-GCN: A Dynamically Weighted Loss Minimization Method for Dealing with the Data Imbalance in Graph Neural Networks
Mahdi Mohammadizadeh, Arash Mozhdehi, Yani Ioannou, Xin Wang
TL;DR
Meta-GCN addresses class imbalance in semi-supervised node classification on graphs by learning per-example weights $w_i$ through meta-learning: the model is trained on a weighted training loss while the weights are chosen to minimize the loss on a small unbiased meta-data set. It introduces a meta-graph $\mathcal{G}^{meta}$ and updates the model parameters with weighted gradients; the weights are derived from meta-gradients with learning rate $\eta$ and normalized as $w_{i,t}=\frac{\tilde{w}_{i,t}}{(\sum_{j}\tilde{w}_{j,t})+\delta(\sum_{j}\tilde{w}_{j,t})}$. The approach uses a GCN backbone with normalized adjacency $\hat{A}=\tilde{D}^{-1/2}\tilde{A}\tilde{D}^{-1/2}$ and layer updates $Z^{l}=\sigma(\hat{A}Z^{l-1}\theta^{l})$, and is trained end to end. On the Haberman and Diabetes datasets, Meta-GCN achieves higher accuracy, macro F1, and AUC-ROC than the baselines, indicating improved minority-class discrimination on imbalanced graph data.
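The three formulas in the summary above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. It assumes the usual GCN convention $\tilde{A}=A+I$ (self-loops added) with $\tilde{D}$ the degree matrix of $\tilde{A}$, takes ReLU as the nonlinearity $\sigma$, and reads $\delta(a)$ as an indicator that equals 1 when $a=0$ and 0 otherwise, so that the normalization never divides by zero. None of these choices is stated in the text above, and the function names are hypothetical.

```python
import numpy as np

def normalized_adjacency(A):
    """Return D~^{-1/2} A~ D~^{-1/2} for a dense, non-negative adjacency matrix A."""
    A_tilde = A + np.eye(A.shape[0])      # assumption: A~ = A + I (self-loops)
    deg = A_tilde.sum(axis=1)             # diagonal of D~; >= 1 because of self-loops
    d_inv_sqrt = 1.0 / np.sqrt(deg)
    # Scale rows and columns by D~^{-1/2} without building the diagonal matrix.
    return A_tilde * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def gcn_layer(A_hat, Z_prev, theta):
    """One layer update Z^l = sigma(A_hat Z^{l-1} theta^l); sigma is assumed to be ReLU."""
    return np.maximum(A_hat @ Z_prev @ theta, 0.0)

def normalize_weights(w_tilde):
    """w_i = w~_i / (sum_j w~_j + delta(sum_j w~_j)).

    Assumption: delta(a) = 1 if a == 0 else 0, which leaves all weights at zero
    when every unnormalized weight is zero.
    """
    total = w_tilde.sum()
    delta = 1.0 if total == 0 else 0.0
    return w_tilde / (total + delta)
```

For a two-node graph with a single edge, `normalized_adjacency` returns a matrix whose entries are all 0.5, and `normalize_weights` maps unnormalized weights `[1, 3]` to `[0.25, 0.75]`.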
Abstract
Although many real-world applications, such as disease prediction and fault detection, suffer from class imbalance, most existing graph-based classification methods ignore the skewness of the class distribution and therefore tend to be biased towards the majority class(es). Conventional methods typically tackle this problem by assigning a weight to each sample based on a function of its loss, which can lead to over-fitting on outliers. In this paper, we propose a meta-learning algorithm, named Meta-GCN, that adaptively learns the example weights by minimizing the loss on a small unbiased meta-data set while simultaneously optimizing the model weights. Through experiments, we show that Meta-GCN outperforms state-of-the-art frameworks and other baselines in terms of accuracy, the area under the receiver operating characteristic curve (AUC-ROC), and macro F1-score for classification tasks on two different datasets.
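To make the idea of learning example weights from a small unbiased meta-data set concrete, here is a rough sketch of one reweighting step in the general learning-to-reweight style. It should be read as an illustration under stated assumptions, not as the Meta-GCN algorithm itself. For brevity it swaps the GCN for a plain logistic-regression model with parameters `theta`. It further assumes a one-step lookahead on the weighted training loss, evaluated at zero initial weights, and assumes that negative unnormalized weights are clipped to zero. The names `meta_reweight_step`, `alpha` (model step size) and `eta` (weight step size) are hypothetical.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def per_example_grads(theta, X, y):
    """Gradient of each example's logistic loss w.r.t. theta; shape (n, d)."""
    return (sigmoid(X @ theta) - y)[:, None] * X

def meta_reweight_step(theta, X_train, y_train, X_meta, y_meta, alpha=0.1, eta=1.0):
    """One illustrative step: derive example weights from meta-gradients, then update theta."""
    g_train = per_example_grads(theta, X_train, y_train)            # (n, d)
    g_meta = per_example_grads(theta, X_meta, y_meta).mean(axis=0)  # (d,)

    # Lookahead theta'(w) = theta - alpha * sum_i w_i g_i. At w = 0 the derivative
    # of the meta loss with respect to w_i is -alpha * <g_meta, g_i>.
    meta_grad_w = -alpha * (g_train @ g_meta)

    # Assumption: gradient step on w from zero, with negative weights clipped to zero.
    w_tilde = np.maximum(0.0, -eta * meta_grad_w)

    # Normalize; the indicator term avoids division by zero when all weights vanish.
    total = w_tilde.sum()
    w = w_tilde / (total + (1.0 if total == 0 else 0.0))

    # Weighted gradient update of the model parameters.
    theta_new = theta - alpha * (w[:, None] * g_train).sum(axis=0)
    return theta_new, w
```

Under these assumptions, a training example receives a positive weight only when its gradient points in a similar direction to the meta-set gradient, which is how examples that conflict with the unbiased set get down-weighted.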
