Adjusted Count Quantification Learning on Graphs
Clemens Damke, Eyke Hüllermeier
TL;DR
This work tackles graph quantification learning under distribution shift, where the prior probability shift assumption often fails. It introduces Structural Importance Sampling (SIS), which accounts for (structural) covariate shift by reweighting training samples with density-ratio estimates on graph vertices, and Neighborhood-aware Adjusted Count (NACC), which improves class identifiability using 1-hop neighbor information. Quantification adjusts the predicted prevalences via a confusion-matrix framework, solved as a constrained optimization over the probability simplex; on graphs, this adjustment is augmented by SIS and NACC. Empirical results on five graph benchmarks and a real-world Twitch Gamers dataset show that SIS (alone and in combination with NACC) consistently outperforms baselines under both synthetic and real-world shifts. The results highlight the importance of modeling covariate shift in graph quantification and suggest future directions toward distribution-matching quantifiers and graph-quantification benchmarks.
Abstract
Quantification learning is the task of predicting the label distribution of a set of instances. We study this problem in the context of graph-structured data, where the instances are vertices. Previously, this problem has only been addressed via node clustering methods. In this paper, we extend the popular Adjusted Classify & Count (ACC) method to graphs. We show that the prior probability shift assumption upon which ACC relies is often not applicable to graph quantification problems. To address this issue, we propose structural importance sampling (SIS), the first graph quantification method that is applicable under (structural) covariate shift. Additionally, we propose Neighborhood-aware ACC, which improves quantification in the presence of non-homophilic edges. We show the effectiveness of our techniques on multiple graph quantification tasks.
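To make the baseline concrete, the following is a minimal sketch of plain Adjusted Classify & Count on generic labeled data, not the SIS or NACC extensions proposed here. It assumes a confusion matrix C with C[i][j] estimating P(predicted = i | true = j) from held-out labeled instances, and recovers the target prevalence p by minimizing ||C p - q||^2 over the probability simplex, where q is the distribution of predicted labels on the target set. The function names and the projected-gradient solver are illustrative assumptions; any simplex-constrained least-squares solver could be substituted.

```python
def project_to_simplex(v):
    """Euclidean projection of v onto the probability simplex."""
    u = sorted(v, reverse=True)
    cumulative, theta = 0.0, 0.0
    for k, u_k in enumerate(u, start=1):
        cumulative += u_k
        t = (cumulative - 1.0) / k
        if u_k - t > 0:
            theta = t  # keep the threshold of the largest valid k
    return [max(x - theta, 0.0) for x in v]


def confusion_matrix(true_labels, predicted_labels, num_classes):
    """Estimate C[i][j] = P(predicted = i | true = j) from held-out data."""
    counts = [[0.0] * num_classes for _ in range(num_classes)]
    for y, y_hat in zip(true_labels, predicted_labels):
        counts[y_hat][y] += 1.0
    totals = [sum(counts[i][j] for i in range(num_classes))
              for j in range(num_classes)]
    return [[counts[i][j] / totals[j] if totals[j] > 0 else 0.0
             for j in range(num_classes)]
            for i in range(num_classes)]


def adjusted_count(conf, target_predictions, num_classes, steps=2000):
    """Solve min_p ||conf @ p - q||^2 with p on the simplex (hypothetical helper)."""
    n = len(target_predictions)
    # q: plain classify & count estimate on the target set
    q = [target_predictions.count(i) / n for i in range(num_classes)]
    p = list(q)  # start from the unadjusted estimate
    lr = 1.0 / num_classes  # safe step size for a column-stochastic matrix
    for _ in range(steps):
        residual = [sum(conf[i][j] * p[j] for j in range(num_classes)) - q[i]
                    for i in range(num_classes)]
        grad = [sum(conf[i][j] * residual[i] for i in range(num_classes))
                for j in range(num_classes)]
        p = project_to_simplex([p[j] - lr * grad[j]
                                for j in range(num_classes)])
    return p


# Toy usage: a binary classifier with 90% recall on class 0 and 80% on class 1.
val_true = [0] * 10 + [1] * 10
val_pred = [0] * 9 + [1] * 1 + [0] * 2 + [1] * 8
C = confusion_matrix(val_true, val_pred, num_classes=2)

# Target set whose true prevalence is (0.3, 0.7) yields predictions (0.41, 0.59).
target_pred = [0] * 41 + [1] * 59
estimate = adjusted_count(C, target_pred, num_classes=2)  # approximately [0.3, 0.7]
```

The adjustment is only valid when P(predicted | true) is the same on the held-out and target data, which is exactly the prior probability shift assumption that the abstract argues often fails on graphs.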
