NCSAC: Effective Neural Community Search via Attribute-augmented Conductance

Longlong Lin, Quanao Li, Miao Qiao, Zeli Wang, Jin Zhao, Rong-Hua Li, Xin Luo, Tao Jia

TL;DR

This work tackles neural community search on attributed graphs by bridging traditional rule-based constraints with deep learning. It introduces attribute-augmented conductance, which blends topology and attribute similarity, to obtain a high-quality coarse candidate, then refines this candidate via a PPO-based reinforcement learning framework that learns when and how to add or remove nodes. Key contributions include the formalization and efficient computation of the attribute-augmented conductance of a community $C$, an adaptive extractor with $O(m+nk)$ complexity, a community-aware state encoder trained with contrastive and triplet losses, and an RL-based refiner with flexible termination and a reward shaping mechanism. Empirical results on six real graphs show consistent F1-score improvements (up to $42.4\%$) over ten baselines, establishing NCSAC as a scalable and effective solution for neural community search.

Abstract

Identifying locally dense communities closely connected to a user-initiated query node is crucial for a wide range of applications. Existing approaches either depend solely on rule-based constraints or rely exclusively on deep learning to identify target communities. This raises an important question: can deep learning be integrated with rule-based constraints to elevate the quality of community search? In this paper, we answer this question affirmatively by introducing a novel approach called Neural Community Search via Attribute-augmented Conductance, abbreviated as NCSAC. Specifically, NCSAC first proposes a novel concept of attribute-augmented conductance, which blends the (internal and external) structural proximity and the attribute similarity. Then, NCSAC extracts a coarse candidate community of satisfactory quality using the proposed attribute-augmented conductance. Subsequently, NCSAC frames community search as a graph optimization task and refines the candidate community through reinforcement learning techniques, thereby producing high-quality results. Extensive experiments on six real-world graphs and ten competitors demonstrate the superiority of our solutions in terms of accuracy, efficiency, and scalability. Notably, the proposed solution outperforms state-of-the-art methods, achieving an F1-score improvement ranging from 5.3\% to 42.4\%. For reproducibility purposes, the source code is available at https://github.com/longlonglin/ncsac.

Paper Structure

This paper contains 19 sections, 5 theorems, 19 equations, 10 figures, 6 tables, 1 algorithm.

Key Result

Theorem 1

Given a graph $G$ and a community $C$, consider a random walk $(w_t)_{t\in \mathbb{N}}$ generated by the transition matrix $\boldsymbol{P} = \boldsymbol{D}^{-1}\boldsymbol{A}$, whose initial state $w_0$ lies in $C$ (or $\bar{C}$) and is chosen at random according to the degree distribution. We have

Figures (10)

  • Figure 1: The framework of our proposed NCSAC. In the candidate community extraction stage, NCSAC systematically extracts a coarse candidate community $C_{coa}$ by leveraging our novel attribute-augmented conductance (the red node is the query node). Subsequently, we pre-train a state encoder to effectively integrate community awareness information into the state encoding. Finally, we apply non-trivial reinforcement learning techniques to flexibly refine the coarse candidate community, ultimately yielding high-quality results (best view in color).
  • Figure 2: From the original attribute graph to the multigraph.
  • Figure 3: Illustration of Example \ref{alg_example} for Algorithm \ref{PCE}. The left side shows the update process that occurs after each node is added. On the right side, the update of $cut_a$ is elaborated using a node-attribute bipartite graph. The dark blue nodes indicate the current search space $tmp$, while the dark green nodes represent the nodes to be added (best view in color).
  • Figure 4: Offline training. Initiating the refinement process from the candidate coarse community, each step generates a new set of states, actions, and rewards, continuing until the predefined termination policy is activated. This procedure is conducted over $\tau$ episodes, during which the agent is trained utilizing the $\tau$ trajectories obtained (best view in color).
  • Figure 5: Comparison of our algorithm NCSAC with baselines. Since QD-GNN runs out of memory on Twitter and YouTube, we only report its results on the four remaining datasets (best view in color).
  • ...and 5 more figures

Theorems & Definitions (21)

  • Definition 1: Community Search
  • Definition 2: Conductance
  • Theorem 1
  • Proof 1
  • Definition 3: Attribute Edge
  • Definition 4: Attribute Degree
  • Definition 5: Attribute-based Random Walk
  • Example 1
  • Definition 6: Attribute-based Conductance
  • Definition 7: Attribute-augmented Conductance
  • ...and 11 more
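
To make the listed notion of conductance (Definition 2) and the random-walk setting of Theorem 1 more concrete, the following is a minimal sketch in plain Python. It assumes the standard textbook definition of conductance, $cut(C) / \min(vol(C), vol(\bar{C}))$, which may differ in detail from the paper's own definition, and it illustrates the well-known identity that a walk driven by $\boldsymbol{P}=\boldsymbol{D}^{-1}\boldsymbol{A}$ and started in $C$ in proportion to degree leaves $C$ in one step with probability $cut(C)/vol(C)$. It is not a reconstruction of the theorem's statement, and it does not implement attribute-augmented conductance. The function names and the toy graph are hypothetical.

```python
from fractions import Fraction


def conductance(adj, community):
    """Standard conductance: cut(C) / min(vol(C), vol(V minus C)).

    adj maps each node to the set of its neighbours (undirected, unweighted).
    This is the textbook definition, assumed here for illustration.
    """
    community = set(community)
    cut = sum(1 for u in community for v in adj[u] if v not in community)
    vol_c = sum(len(adj[u]) for u in community)
    vol_rest = sum(len(adj[u]) for u in adj if u not in community)
    denom = min(vol_c, vol_rest)
    if denom == 0:
        raise ValueError("conductance is undefined when one side has zero volume")
    return Fraction(cut, denom)


def escape_probability(adj, community):
    """One-step probability of leaving C under P = D^{-1} A.

    The start node w_0 is drawn from C with probability deg(u) / vol(C).
    """
    community = set(community)
    vol_c = sum(len(adj[u]) for u in community)
    prob = Fraction(0)
    for u in community:
        degree = len(adj[u])
        if degree == 0:
            continue  # an isolated node is never chosen as a start node
        start = Fraction(degree, vol_c)
        leave = Fraction(sum(1 for v in adj[u] if v not in community), degree)
        prob += start * leave
    return prob


# Hypothetical toy graph: two triangles joined by the single edge (2, 3).
toy_adj = {
    0: {1, 2},
    1: {0, 2},
    2: {0, 1, 3},
    3: {2, 4, 5},
    4: {3, 5},
    5: {3, 4},
}
toy_community = {0, 1, 2}

print(conductance(toy_adj, toy_community))
print(escape_probability(toy_adj, toy_community))
```

On this toy graph both sides have the same volume, so the two quantities coincide. In general the escape probability equals $cut(C)/vol(C)$, which matches the conductance only when $vol(C) \le vol(\bar{C})$.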