Topology-Driven Attribute Recovery for Attribute Missing Graph Learning in Social Internet of Things

Mengran Li, Junzhou Chen, Chenyun Yu, Guanying Jiang, Ronghui Zhang, Yanming Shen, Houbing Herbert Song

TL;DR

TDAR addresses missing node attributes in Text Attribute Graphs (TAGs) for the Social Internet of Things (SIoT) by combining topology-driven propagation with embedding-space confidence. It introduces four mechanisms, namely Topology-Aware Attribute Propagation (TAAP), Embedding Space Propagation Confidence (ESPC), Node Homogeneity Score (NHS), and Non-Linkage Similarity Calibration (NLSC), within a Graph Auto-Encoder backbone trained with a multi-task objective that combines reconstruction with topology-informed regularization. The authors report better attribute reconstruction, node classification, and clustering than prior methods on six public Attribute Missing Graph (AMG) datasets, with performance that holds up across missing rates and reconstructed features that show strong homogeneity. The code is available at the linked repository.

Abstract

With the advancement of information technology, the Social Internet of Things (SIoT) has fostered the integration of physical devices and social networks, deepening the study of complex interaction patterns. Text Attribute Graphs (TAGs) capture both topological structures and semantic attributes, enhancing the analysis of complex interactions within the SIoT. However, existing graph learning methods are typically designed for complete attributed graphs, and the common issue of missing attributes in Attribute Missing Graphs (AMGs) increases the difficulty of analysis tasks. To address this, we propose the Topology-Driven Attribute Recovery (TDAR) framework, which leverages topological data for AMG learning. TDAR introduces an improved pre-filling method for initial attribute recovery using native graph topology. Additionally, it dynamically adjusts propagation weights and incorporates homogeneity strategies within the embedding space to suit AMGs' unique topological structures, effectively reducing noise during information propagation. Extensive experiments on public datasets demonstrate that TDAR significantly outperforms state-of-the-art methods in attribute reconstruction and downstream tasks, offering a robust solution to the challenges posed by AMGs. The code is available at https://github.com/limengran98/TDAR.
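The idea of pre-filling missing attributes from graph topology can be illustrated with a generic neighbor-propagation scheme. The sketch below is not TDAR's pre-filling method, whose details are not given in this summary; it is a plain feature-propagation baseline under stated assumptions: an undirected graph given as a dense adjacency matrix, a boolean mask marking nodes with known attributes, and a fixed iteration count. The names `normalized_adjacency`, `prefill_attributes`, and `num_iters` are hypothetical.

```python
import numpy as np

def normalized_adjacency(adj: np.ndarray) -> np.ndarray:
    """Symmetric normalization D^{-1/2} (A + I) D^{-1/2} with self-loops."""
    a = adj + np.eye(adj.shape[0])
    d_inv_sqrt = 1.0 / np.sqrt(a.sum(axis=1))
    return a * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def prefill_attributes(adj, x, known_mask, num_iters=20):
    """Fill attribute rows of unknown nodes by repeated propagation.

    Assumption: a generic baseline, not the paper's exact procedure.
    Known rows are reset to their observed values after every step,
    so only the missing rows change.
    """
    a_hat = normalized_adjacency(adj)
    # Start missing rows at zero (this also discards any NaN placeholders).
    x_filled = np.where(known_mask[:, None], x, 0.0)
    for _ in range(num_iters):
        x_filled = a_hat @ x_filled
        x_filled[known_mask] = x[known_mask]
    return x_filled

# Toy path graph 0 - 1 - 2 where node 1 has no attributes.
adj = np.array([[0, 1, 0],
                [1, 0, 1],
                [0, 1, 0]], dtype=float)
x = np.array([[1.0, 0.0],
              [np.nan, np.nan],
              [0.0, 1.0]])
known = np.array([True, False, True])
filled = prefill_attributes(adj, x, known)
```

In this toy case the missing node ends up with an equal mix of its two neighbors' attributes, while the observed rows stay untouched. A pre-filled matrix of this kind is what would then be passed, together with the normalized adjacency, to an encoder.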

Paper Structure (42 sections, 20 equations, 10 figures, 7 tables)

Figures (10)

  • Figure 1: Representation of an AMG and its features in a citation network of the SIoT. (a) Illustration of missing data in the SIoT, (b) illustration of the AMG, (c) known and unknown attributes as well as structural features of the AMG.
  • Figure 2: The overall framework of TDAR. In the graph $G$ with missing node attributes, the attributes of nodes are first propagated through TAAP to aggregate known attribute features from different neighborhoods, obtaining refined features $\tilde{\mathbf X}$. Then, the $\tilde{\mathbf X}$ and the normalized adjacency matrix $\hat{\mathbf A}$ are jointly fed into the GAE component to obtain latent embeddings $\mathbf Z$ and reconstructed features $\hat{\mathbf X}$. Meanwhile, ESPC dynamically adjusts embeddings based on the confidence of relationships. In addition, NHS and NLSC scores are computed to provide auxiliary information or penalties to enhance the learning process.
  • Figure 3: Comparison of convergence status and training time.
  • Figure 4: Performance comparison under different missing rates. The legend reports the average over all missing rates. The datasets Cora, Citeseer, Amac, and Amap are arranged in rows from top to bottom.
  • Figure 5: Comparison of the homogeneity of KNN graphs constructed from features reconstructed by four methods.
  • ...and 5 more figures
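
The kind of measurement behind the Figure 5 caption can be sketched as follows. The paper's exact homogeneity definition is not given in this summary, so the sketch assumes a common proxy: build a KNN graph from the reconstructed features using cosine similarity, then report the fraction of KNN edges whose endpoints share a class label. The function names `knn_edges` and `edge_homophily`, the choice of cosine similarity, and the toy data are all assumptions made for illustration.

```python
import numpy as np

def knn_edges(features: np.ndarray, k: int):
    """Directed KNN edges (source, target) by cosine similarity, no self-edges."""
    norms = np.linalg.norm(features, axis=1, keepdims=True)
    unit = features / np.clip(norms, 1e-12, None)
    sim = unit @ unit.T
    np.fill_diagonal(sim, -np.inf)              # never pick a node as its own neighbor
    nbrs = np.argsort(-sim, axis=1)[:, :k]      # k most similar nodes per row
    src = np.repeat(np.arange(features.shape[0]), k)
    return src, nbrs.ravel()

def edge_homophily(features: np.ndarray, labels: np.ndarray, k: int = 1) -> float:
    """Fraction of KNN edges that connect two nodes with the same label.

    Assumption: used here as a stand-in for the 'homogeneity' in the caption.
    """
    src, dst = knn_edges(features, k)
    return float(np.mean(labels[src] == labels[dst]))

# Toy reconstructed features: two well-separated groups of two nodes each.
recon = np.array([[1.0, 0.0],
                  [0.9, 0.1],
                  [0.0, 1.0],
                  [0.1, 0.9]])
labels = np.array([0, 0, 1, 1])
score = edge_homophily(recon, labels, k=1)
```

A score close to 1 means that nodes whose reconstructed features are similar also tend to share a label, which is the property a comparison of this kind is meant to expose.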