Collaborative Graph Walk for Semi-supervised Multi-Label Node Classification

Uchenna Akujuobi, Han Yufei, Qiannan Zhang, Xiangliang Zhang

TL;DR

The paper tackles semi-supervised multi-label node classification on attributed graphs, where labeled data is scarce and dependencies among labels must be captured. It proposes Multi-Label-Graph-Walk (MLGW), a reinforcement-learning framework in which $L$ label-specific agents perform simultaneous graph walks. Each agent uses a GRU-based history and a score network to select the next node, and is trained with an on-policy gradient method regularized by a centralized distilled policy $\pi_d$ that encourages cross-label knowledge sharing. Key contributions include (i) a collaborative policy learning scheme that models label correlations through centralized regularization, (ii) a POMDP-based, end-to-end trainable architecture that supports both transductive and inductive inference, and (iii) empirical results on DBLP and Delve that show significant improvements over state-of-the-art baselines, supported by trajectory analyses of the learned walks. Overall, the approach advances semi-supervised multi-label graph learning through label-aware exploration and embedding refinement on real-world attributed networks.

Abstract

In this work, we study the semi-supervised multi-label node classification problem in attributed graphs. Classic solutions to multi-label node classification follow two steps: first learn node embeddings, then build a node classifier on the learned embeddings. To improve the discriminating power of the node embeddings, we propose a novel collaborative graph walk, named Multi-Label-Graph-Walk, which fine-tunes node representations with the available label assignments in attributed graphs via reinforcement learning. The proposed method formulates the multi-label node classification task as simultaneous graph walks conducted by multiple label-specific agents. Furthermore, the policies of the label-wise graph walks are learned cooperatively to capture, first, the predictive relation between node labels and the structural attributes of graphs and, second, the correlation among the multiple label-specific classification tasks. A comprehensive experimental study demonstrates that the proposed method achieves significantly better multi-label classification performance than state-of-the-art approaches and explores the graph more efficiently.

Paper Structure

This paper contains 23 sections, 6 equations, 6 figures, 5 tables, 1 algorithm.

Figures (6)

  • Figure 1: The proposed model. (a) A small attributed graph in which the label-specific agent $A_i$ is currently at node $v_0$ (so $v_t=v_0$) and is deciding which node to visit at time $t+1$. At the left of block (c), the score network $f^i_s(.;\theta_s)$ takes as input the previous history $h_{t-1}$, the current node attribute $x^t_v$, and the attributes of the immediate neighboring nodes and edges $\{x^t_n, x^t_e\}$. The choice $a_t$ of the next node $v_{t+1}$ to visit is sampled from the output of the score network. Neighboring nodes $x^t_n$ are selected and averaged to form the immediate neighborhood aggregation $c^t_n$. The history network $f^i_h(.;\theta_t)$ takes as input the aggregated neighbor embedding $c^t_n$, the previous history $h_{t-1}$, and the current node embedding $x^t_v$, and outputs the current walk history $h_t$, as shown in (b). At time $t+1$, when the label agent has moved to $v_1$, the same process repeats and moves the agent to $v_4$ at $t+2$, and so on. After a number of steps, the final history vector, which summarizes the information gathered during the graph walk, is passed to the classification network $f^i_c(.;\theta_c)$ to classify the starting node (i.e., to decide whether the node from which $A_i$ started the walk belongs to label $l_i$).
  • Figure 2: Illustration of the agent communication framework on a network with four possible labels. Learning the centralized policy depends on the historical context of each walk path, the embeddings of the currently visited nodes, the embeddings of all neighboring nodes and edges from each label-specific agent, and the local policy model $p_i$, as shown by Eq. \ref{eq:distilled_pg_global}. The local policy update of each agent includes the regularization enforced by the centralized policy, as defined in Eq. \ref{eq:distilled_pg_local}.
  • Figure 3: The average number of labels per visited node for each label agent during the graph walk.
  • Figure 4: Subgraphs showing the trajectories of two label agents (for labels 0 and 1) under the same settings. Both start from the yellow node, which has labels $\{0,1,2,3\}$, and terminate at the nodes marked with stick figures. The black trajectory belongs to label agent 1 and the blue trajectory to label agent 0. Both agents explore nodes whose labels belong to the starting node's label set, indicated in green. Label IDs are given in the DBLP dataset description (see section \ref{dataset_des}).
  • Figure 5: A heatmap whose $d$-th column shows the label distribution of the nodes visited by the MLGW-REG+ label agent $d$ when starting from nodes with label $d$. Color encodes the visiting frequency rate; a brighter color indicates more frequent visits. Note that the agents have no label information while walking: neither the label of the starting node nor the labels of neighboring nodes.
  • ...and 1 more figures
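
The walk step described in the Figure 1 caption can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the class name `LabelAgent`, the linear score function, the weight shapes, and the toy graph are all hypothetical; edge attributes are omitted; every neighbor is averaged into $c^t_n$ (the caption says neighbors are "selected", without saying how); and the weights are random rather than learned by policy gradient.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

class LabelAgent:
    """Hypothetical label-specific agent with random, untrained weights."""

    def __init__(self, dim, hidden, rng, scale=0.1):
        self.hidden = hidden
        # Score network (assumed linear): [h_{t-1}, x_v, x_n] -> scalar score.
        self.w_score = rng.normal(0, scale, hidden + 2 * dim)
        # History network: a standard GRU cell over the input [x_v, c_n].
        gate_shape = (hidden, 2 * dim + hidden)
        self.Wz = rng.normal(0, scale, gate_shape)
        self.Wr = rng.normal(0, scale, gate_shape)
        self.Wh = rng.normal(0, scale, gate_shape)
        # Classification network (assumed linear + sigmoid) on the final history.
        self.w_cls = rng.normal(0, scale, hidden)

    def next_node_probs(self, h_prev, x_v, X_n):
        # One row per candidate neighbor: [h_{t-1}, x_v, x_n].
        ctx = np.tile(np.concatenate([h_prev, x_v]), (len(X_n), 1))
        return softmax(np.hstack([ctx, X_n]) @ self.w_score)

    def update_history(self, h_prev, x_v, c_n):
        u = np.concatenate([x_v, c_n])
        z = sigmoid(self.Wz @ np.concatenate([u, h_prev]))
        r = sigmoid(self.Wr @ np.concatenate([u, h_prev]))
        h_new = np.tanh(self.Wh @ np.concatenate([u, r * h_prev]))
        return (1 - z) * h_prev + z * h_new

    def walk(self, adj, X, start, steps, rng):
        h, v, path = np.zeros(self.hidden), start, [start]
        for _ in range(steps):
            nbrs = adj[v]
            X_n = X[nbrs]
            probs = self.next_node_probs(h, X[v], X_n)  # uses h_{t-1}
            c_n = X_n.mean(axis=0)                      # neighborhood aggregation
            h = self.update_history(h, X[v], c_n)       # h_t
            v = int(nbrs[rng.choice(len(nbrs), p=probs)])  # sample a_t
            path.append(v)
        # Probability that the *starting* node carries this agent's label.
        return path, float(sigmoid(self.w_cls @ h))

# Toy graph loosely following Figure 1(a); node features are random.
rng = np.random.default_rng(0)
adj = {0: [1, 2, 3], 1: [0, 4], 2: [0], 3: [0, 4], 4: [1, 3]}
X = rng.normal(size=(5, 8))
agent = LabelAgent(dim=8, hidden=16, rng=rng)
path, prob = agent.walk(adj, X, start=0, steps=5, rng=rng)
```

In the full model there would be one such agent per label, all walking simultaneously from the same starting node, with the centralized policy of Figure 2 regularizing each agent's local policy during training.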

Theorems & Definitions (1)

  • Definition 1