Rethinking Independent Cross-Entropy Loss For Graph-Structured Data
Rui Miao, Kaixiong Zhou, Yili Wang, Ninghao Liu, Ying Wang, Xin Wang
TL;DR
This work addresses the mismatch between graph-structured data and the independent cross-entropy loss by introducing joint-cluster supervised learning, which models the joint distribution $p(y_i, \bar{y}_m \mid z_i, \bar{z}_m; \theta)$ between a node and its cluster. It trains end-to-end with a joint-cluster cross-entropy loss that uses cluster-level reference signals and infers node labels via marginalization over the cluster dimension, thereby reducing over-confident predictions and improving robustness. Extensive experiments across small and large graphs, including imbalanced and heterophilic settings, show consistent accuracy gains and enhanced resilience to adversarial attacks, with favorable efficiency relative to CRF-based approaches. The method leverages METIS clustering to capture community structure and demonstrates calibration improvements, offering a scalable path to more reliable GNNs in real-world graph applications.
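To make the loss and the marginalization step concrete, here is a minimal NumPy sketch. It is not the authors' implementation, and several details are assumptions made for illustration only: the classifier head is assumed to emit one logit per (node label, cluster label) pair, the cluster label is assumed to be a soft distribution over classes, and the training target is assumed to be the outer product of the node's one-hot label with that cluster distribution. The function names are hypothetical.

```python
import numpy as np


def joint_log_softmax(logits):
    """Log-softmax taken jointly over all C*C (node label, cluster label) pairs."""
    n, c, _ = logits.shape
    flat = logits.reshape(n, c * c)
    flat = flat - flat.max(axis=1, keepdims=True)  # shift for numerical stability
    log_p = flat - np.log(np.exp(flat).sum(axis=1, keepdims=True))
    return log_p.reshape(n, c, c)


def joint_cluster_cross_entropy(logits, node_labels, cluster_label_dist):
    """Cross-entropy between the predicted joint distribution and a joint target.

    logits:             (N, C, C) scores; axis 1 = node label, axis 2 = cluster label
    node_labels:        (N,) integer class of each training node
    cluster_label_dist: (N, C) assumed soft label of the cluster each node belongs to
    """
    _, c, _ = logits.shape
    log_p = joint_log_softmax(logits)
    node_one_hot = np.eye(c)[node_labels]  # (N, C)
    # Assumed target: outer product of the node label and the cluster label.
    target = node_one_hot[:, :, None] * cluster_label_dist[:, None, :]
    return float(-(target * log_p).sum(axis=(1, 2)).mean())


def predict_node_labels(logits):
    """Infer node labels by summing the joint distribution over the cluster axis."""
    joint = np.exp(joint_log_softmax(logits))
    node_marginal = joint.sum(axis=2)  # (N, C), each row sums to 1
    return node_marginal.argmax(axis=1), node_marginal
```

With all-zero logits the predicted joint distribution is uniform over the C*C pairs, so the loss equals log(C*C) for any target that sums to one, which gives a quick sanity check. In a real pipeline the logits would come from a GNN applied to the node and cluster representations, and the cluster assignments from a graph partitioner such as METIS.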
Abstract
Graph neural networks (GNNs) have exhibited prominent performance in learning graph-structured data. For the node classification task, traditional supervised learning relies on the i.i.d. assumption among node labels: it simply sums the cross-entropy losses of the independent training nodes and uses the average loss to optimize the GNN's weights. Unlike other data formats, however, the nodes of a graph are naturally connected. It is found that modeling node labels as independent restricts GNNs' capability to generalize over the entire graph and to defend against adversarial attacks. In this work, we propose a new framework, termed joint-cluster supervised learning, to model the joint distribution of each node with its corresponding cluster. We learn the joint distribution of node and cluster labels conditioned on their representations, and train GNNs with the resulting joint loss. In this way, the data-label reference signals extracted from the local cluster explicitly strengthen the discrimination ability on the target node. Extensive experiments demonstrate that joint-cluster supervised learning effectively improves GNNs' node classification accuracy. Furthermore, because the reference signals may be free from malicious interference, our learning paradigm significantly protects node classification from adversarial attacks.
