Semi-supervised Node Importance Estimation with Informative Distribution Modeling for Uncertainty Regularization
Yankai Chen, Taotao Wang, Yixiang Fang, Yunyu Xiao
TL;DR
This work addresses the problem of estimating continuous node importance in heterogeneous graphs when ground-truth labels are available for only some of the nodes. It introduces EASING, a semi-supervised framework that explicitly models uncertainty via a Distribution-based Joint Estimator (DJE), which jointly predicts each node's importance and the uncertainty of that prediction and generates pseudo-labels for unlabeled nodes. The learning objective combines labeled and pseudo-labeled data under a heteroscedastic regression regime, so that regularization varies with each node's estimated uncertainty. Empirical results on three real-world datasets show that EASING consistently outperforms strong baselines on both value estimation and ranking, remains robust when labeled data are limited, and is compatible with other graph models. The approach targets applications that need reliable node importance estimates when labels are scarce and graph information is heterogeneous.
Abstract
Node importance estimation, a classical problem in network analysis, underpins various web applications. Previous methods either exploit intrinsic topological characteristics, e.g., graph centrality, or leverage additional information, e.g., data heterogeneity, for node feature enhancement. However, these methods follow the supervised learning setting, overlooking the fact that in practice ground-truth importance labels are usually available for only part of the nodes. In this work, we propose the first semi-supervised node importance estimation framework, i.e., EASING, to improve learning quality for unlabeled data in heterogeneous graphs. Unlike previous approaches, EASING explicitly captures uncertainty to reflect the confidence of model predictions. To jointly estimate the importance values and uncertainties, EASING incorporates DJE, a deep encoder-decoder neural architecture. DJE introduces distribution modeling for graph nodes, and both the importance and the uncertainty estimates are derived from the resulting distribution representations. Additionally, DJE facilitates effective pseudo-label generation for the unlabeled data to enrich the training samples. Based on labeled and pseudo-labeled data, EASING develops effective semi-supervised heteroscedastic learning with varying node uncertainty regularization. Extensive experiments on three real-world datasets highlight the superior performance of EASING compared to competing methods. Code is available at https://github.com/yankai-chen/EASING.
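
To make the idea of heteroscedastic learning with per-node uncertainty concrete, the sketch below shows a generic Gaussian negative log-likelihood in which each node has its own predicted variance, combined over labeled and pseudo-labeled nodes. It is an illustration under stated assumptions, not the objective used in EASING: the function names, the Gaussian form, and the `pseudo_weight` hyperparameter are hypothetical and do not come from the paper or its repository.

```python
import math


def heteroscedastic_nll(target, mu, log_var):
    """Gaussian negative log-likelihood for one node, up to an additive constant.

    mu is the predicted importance and log_var the predicted log-variance
    (the uncertainty). A larger variance shrinks the squared-error term
    but is penalised by the log-variance term.
    """
    return 0.5 * (math.exp(-log_var) * (target - mu) ** 2 + log_var)


def semi_supervised_loss(labeled, pseudo, pseudo_weight=0.5):
    """Mean NLL over labeled nodes plus a weighted mean over pseudo-labeled nodes.

    Each item is a (target, mu, log_var) triple. pseudo_weight is an
    assumed hyperparameter chosen for illustration only.
    """
    sup = sum(heteroscedastic_nll(*t) for t in labeled) / max(len(labeled), 1)
    unsup = sum(heteroscedastic_nll(*t) for t in pseudo) / max(len(pseudo), 1)
    return sup + pseudo_weight * unsup
```

In this form, a node whose prediction is far from its target contributes less loss when the model also reports high variance for it, which is the sense in which the regularization varies with node uncertainty. For example, `heteroscedastic_nll(3.0, 1.0, 0.0)` is larger than `heteroscedastic_nll(3.0, 1.0, math.log(4.0))`.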
