Toward Fair Graph Neural Networks Via Dual-Teacher Knowledge Distillation
Chengyu Li, Debo Cheng, Guixian Zhang, Yi Li, Shichao Zhang
TL;DR
This work addresses the utility loss that fairness-oriented graph neural networks incur when they are trained on partial data, that is, on node features alone or on graph structure alone. It proposes FairDTD, a dual-teacher knowledge distillation framework whose design is guided by a causal graph. Two fairness-oriented teachers, one trained on node features and one on graph structure, transfer fair knowledge to a student model that is trained on the full data, aided by graph-level distillation and node-specific temperature learning. The theoretical analysis shows that blocking specific causal paths can reduce dependence on sensitive attributes, and experiments on Pokec-z, Pokec-n, and Credit report improved fairness with preserved utility relative to multiple baselines. The approach targets real-world graph tasks such as social network analysis and credit scoring.
Abstract
Graph Neural Networks (GNNs) have demonstrated strong performance in graph representation learning across various real-world applications. However, they often produce predictions biased by sensitive attributes such as religion or gender, an issue that many existing methods overlook. Recent studies have focused on reducing bias in GNNs, but these approaches often rely on training with partial data (e.g., using either node features or graph structure alone), which can enhance fairness but frequently compromises model utility because the available graph information is only partly used. To address this tradeoff, we propose a knowledge distillation strategy that balances fairness and utility. Specifically, we introduce FairDTD, a novel Fair representation learning framework built on Dual-Teacher Distillation, which uses a causal graph model to guide the design of the distillation process. FairDTD employs two fairness-oriented teacher models, a feature teacher and a structure teacher, to perform dual distillation: the student model learns fairness knowledge from both teachers while training on the full data to mitigate utility loss. To enhance information transfer, we incorporate graph-level distillation, which indirectly supplements graph information during training, and a node-specific temperature module, which improves the transfer of fair knowledge for individual nodes. Experiments on diverse benchmark datasets show that FairDTD achieves the best fairness among the compared methods while preserving high model utility, demonstrating its effectiveness in fair representation learning for GNNs.
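To make the dual-teacher idea concrete, the following is a minimal sketch of a distillation loss with a per-node temperature, written in plain Python. It is not the FairDTD implementation: the function names, the mixing weight `alpha`, the use of KL divergence, and the `T**2` scaling (a common convention in knowledge distillation) are all assumptions made for illustration. In the paper the temperatures are produced by a learned module; here they are simply passed in as numbers.

```python
import math

def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over one node's class logits."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))

def dual_teacher_kd_loss(student_logits, feat_teacher_logits,
                         struct_teacher_logits, temperatures, alpha=0.5):
    """Average distillation loss over nodes (hypothetical helper).

    Each argument except `alpha` is a list with one entry per node.
    `alpha` weights the feature teacher against the structure teacher;
    this weighting scheme is an assumption, not taken from the paper.
    """
    total = 0.0
    for s, tf, ts, t in zip(student_logits, feat_teacher_logits,
                            struct_teacher_logits, temperatures):
        p_student = softmax(s, t)
        loss_feat = kl_divergence(softmax(tf, t), p_student)
        loss_struct = kl_divergence(softmax(ts, t), p_student)
        # T^2 keeps the loss scale comparable across temperatures.
        total += (t ** 2) * (alpha * loss_feat + (1 - alpha) * loss_struct)
    return total / len(student_logits)

# Two nodes, two classes, with a different temperature for each node.
student = [[2.0, 0.5], [0.1, 1.2]]
feature_teacher = [[1.5, 0.7], [0.3, 1.0]]
structure_teacher = [[1.8, 0.2], [0.0, 1.5]]
loss = dual_teacher_kd_loss(student, feature_teacher, structure_teacher,
                            temperatures=[1.0, 2.0])
print(loss)
```

In a full training loop this term would be added to the student's ordinary task loss on the full graph, so that the student keeps its utility while being pulled toward the teachers' fairer predictions.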
