Clustering-Oriented Generative Attribute Graph Imputation
Mulin Chen, Bocheng Wang, Jiaxin Zhong, Zongcheng Miao, Xuelong Li
TL;DR
This work tackles attribute-missing graph clustering by jointly performing clustering-aware imputation and edge-oriented refinement. It introduces CGIR, which first learns subcluster distributions to constrain generative imputation and then applies an Edge Attention Network with contrastive learning to identify edge-wise attributes for reliable graph reconstruction. The approach combines a subcluster-aware generator, a discriminator, and an edge-attentional refinement module, trained with an alternating adversarial objective and a subcluster regularizer. Empirical results on four benchmarks demonstrate robust clustering performance under substantial attribute missingness, with ablations confirming the utility of subcluster modeling and edge-focused refinement for practical unsupervised graph clustering.
Abstract
Attribute-missing graph clustering has emerged as a significant unsupervised task, in which attribute vectors are available for only a subset of nodes while the graph structure is intact. Existing models generally follow a two-step paradigm of imputation and refinement. However, most imputation approaches fail to capture class-relevant semantic information, leading to imputation that is sub-optimal for clustering. Moreover, existing refinement strategies optimize the learned embedding through graph reconstruction while neglecting the fact that some attributes are uncorrelated with the graph. To remedy these problems, we propose the Clustering-oriented Generative Imputation with reliable Refinement (CGIR) model. Concretely, subcluster distributions are estimated to reveal class-specific characteristics precisely and to constrain the sampling space of the generative adversarial module, so that the imputed nodes are driven to align with the correct clusters. Afterwards, multiple subclusters are merged to guide the proposed edge attention network, which identifies the edge-wise attributes for each class, so that redundant attributes in graph reconstruction do not disturb the refinement of the overall embedding. In sum, CGIR splits attribute-missing graph clustering into the search for and merging of subclusters, which guide node imputation and refinement within a unified framework. Extensive experiments demonstrate the advantages of CGIR over state-of-the-art competitors.
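To make the idea of constraining the sampling space concrete, the following is a minimal illustrative sketch and not the CGIR implementation. It assumes that subclusters can be approximated by over-clustering the observed attribute vectors with plain k-means and that each subcluster is summarized by a diagonal Gaussian; the abstract does not specify either choice. All function names (`kmeans`, `fit_subclusters`, `sample_from_subcluster`) are hypothetical. A generator would then draw its input from one subcluster's distribution rather than from an unconstrained prior.

```python
import numpy as np


def kmeans(X, k, n_iter=50, seed=0):
    """Plain Lloyd's k-means; returns (labels, centers)."""
    X = np.asarray(X, dtype=float)
    rng = np.random.default_rng(seed)
    # Initialise centers from k distinct data points (fancy indexing copies).
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        # Squared distance from every point to every center.
        dist = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)
        for j in range(k):
            members = X[labels == j]
            if len(members) > 0:
                centers[j] = members.mean(axis=0)
    # Final assignment so that labels match the returned centers.
    dist = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return dist.argmin(axis=1), centers


def fit_subclusters(X_obs, k):
    """Assumed stand-in for subcluster estimation: one diagonal Gaussian
    (mean, per-dimension std) per k-means cell of the observed attributes."""
    X_obs = np.asarray(X_obs, dtype=float)
    labels, centers = kmeans(X_obs, k)
    stds = np.ones_like(centers)
    for j in range(k):
        members = X_obs[labels == j]
        if len(members) > 1:
            stds[j] = members.std(axis=0) + 1e-6  # avoid zero variance
    return centers, stds


def sample_from_subcluster(centers, stds, j, n, rng):
    """Draw n vectors restricted to subcluster j's estimated distribution."""
    noise = rng.standard_normal((n, centers.shape[1]))
    return centers[j] + stds[j] * noise


# Toy usage: two well-separated groups of observed attribute vectors.
rng = np.random.default_rng(0)
X_obs = np.vstack([
    rng.normal(0.0, 0.5, size=(50, 2)),
    rng.normal(10.0, 0.5, size=(50, 2)),
])
centers, stds = fit_subclusters(X_obs, k=2)
z = sample_from_subcluster(centers, stds, j=0, n=5, rng=rng)
```

In this toy setting the samples in `z` stay close to `centers[0]`, which is the sense in which a subcluster-aware prior keeps imputed attributes near one class-specific region. How a subcluster is chosen for a given attribute-missing node (for example, from the graph structure) is left out of the sketch.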
