SCOREH+: A High-Order Node Proximity Spectral Clustering on Ratios-of-Eigenvectors Algorithm for Community Detection

Yanhui Zhu, Fang Hu, Lei Hsin Kuo, Jia Liu

TL;DR

SCOREH+ introduces a high-order proximity spectral clustering framework that preserves beyond-first-neighbor information via Radial Basis Functions and Katz-based proximity. By normalizing a high-order proximity matrix and adaptively selecting the number of leading eigenvectors (with an optional (k+1)th vector for weak-signal graphs), the method improves community detection robustness in noisy networks. Extensive experiments on 11 real-world networks and numerous synthetic benchmarks show SCOREH+ achieving competitive or superior NMI and modularity relative to ASE, Louvain, Fast-Greedy, SC, SCORE, and SCORE+. The approach emphasizes well-conditioned similarity matrices, flexible RBF choices, and practical eigen-selection, offering strong performance with controllable parameter tuning that generalizes across diverse networks.

Abstract

Research on complex networks has made significant progress in revealing the mesoscopic features of networks, and community detection is an important part of understanding real-world complex systems. We present a High-order node proximity Spectral Clustering on Ratios-of-Eigenvectors (SCOREH+) algorithm for locating communities in complex networks. The algorithm improves on SCORE and SCORE+ and preserves the high-order transitivity information of the network affinity matrix. We optimize the high-order proximity matrix from the initial affinity matrix using Radial Basis Functions (RBFs) and the Katz index. In addition to optimizing the Laplacian matrix, we implement a procedure that adds an additional eigenvector (the $(k+1)^{th}$ leading eigenvector) to the spectral domain used for clustering if the network is considered to be a "weak signal" graph. The algorithm has been applied to both real-world and synthetic data sets and is compared with state-of-the-art algorithms such as ASE, Louvain, Fast-Greedy, Spectral Clustering (SC), SCORE, and SCORE+. To demonstrate the efficacy of the proposed method, we conducted comparison experiments on eleven real-world networks and a number of synthetic networks with noise. On most of these networks, SCOREH+ outperforms the baseline methods. Moreover, by tuning the RBFs and their shaping parameters, we may generate state-of-the-art community structures on all real-world networks and even on noisy synthetic networks.
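
To make the pipeline in the abstract more concrete, the following is a minimal sketch of two of its ingredients: a Katz high-order proximity matrix and clustering on entrywise ratios of eigenvectors. It is an illustration under stated assumptions, not the authors' implementation. It omits the RBF transformation and the Laplacian normalization, whose exact forms are not given in this summary, and it assumes a connected, undirected graph so that the leading eigenvector has no zero entries. All function names, the default damping factor `beta`, and the `use_extra` flag are hypothetical. The Katz index is taken in its standard closed form $S = \sum_{l \ge 1} \beta^l A^l = (I - \beta A)^{-1} - I$, which converges when $\beta$ is below the reciprocal of the spectral radius of $A$.

```python
import numpy as np


def katz_proximity(A, beta=None):
    """Katz index S = sum_{l>=1} beta^l A^l = (I - beta*A)^{-1} - I."""
    n = A.shape[0]
    if beta is None:
        # Assumption: half of the convergence bound 1 / spectral_radius(A).
        beta = 0.5 / np.max(np.abs(np.linalg.eigvalsh(A)))
    return np.linalg.inv(np.eye(n) - beta * A) - np.eye(n)


def eigenvector_ratios(S, k, use_extra=False):
    """Divide eigenvectors 2..k (or 2..k+1) entrywise by the leading one."""
    m = k + 1 if use_extra else k  # hypothetical switch for "weak signal" graphs
    vals, vecs = np.linalg.eigh(S)
    order = np.argsort(-np.abs(vals))[:m]  # largest eigenvalues in magnitude
    lead = vecs[:, order[0]]
    return vecs[:, order[1:]] / lead[:, None]


def kmeans(X, k, iters=50):
    """Small Lloyd's k-means with deterministic farthest-point initialization."""
    centers = [X[0]]
    for _ in range(1, k):
        d = np.min([np.sum((X - c) ** 2, axis=1) for c in centers], axis=0)
        centers.append(X[np.argmax(d)])
    centers = np.array(centers, dtype=float)
    for _ in range(iters):
        dists = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = X[labels == j].mean(axis=0)
    return labels


def sketch_communities(A, k, use_extra=False):
    """Katz proximity -> eigenvector ratios -> k-means (illustrative only)."""
    S = katz_proximity(A)
    R = eigenvector_ratios(S, k, use_extra=use_extra)
    return kmeans(R, k)


def two_cliques(m=4):
    """Toy graph: two m-node cliques joined by a single bridge edge."""
    A = np.zeros((2 * m, 2 * m))
    A[:m, :m] = 1.0
    A[m:, m:] = 1.0
    np.fill_diagonal(A, 0.0)
    A[m - 1, m] = A[m, m - 1] = 1.0
    return A


if __name__ == "__main__":
    # The two cliques should receive two different labels.
    print(sketch_communities(two_cliques(), k=2))
```

On the toy graph, nodes 0 to 3 and nodes 4 to 7 form the two cliques, so a correct run assigns one label to each group. Deciding when a graph counts as "weak signal" is left to the caller here, since the criterion is not stated in this summary.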

Paper Structure (45 sections, 12 equations, 13 figures, 15 tables, 2 algorithms)

This paper contains 45 sections, 12 equations, 13 figures, 15 tables, 2 algorithms.

Figures (13)

  • Figure 1: Flowchart of the SCOREH+ model. The model consists of three phases: i) extraction of the high-order proximity matrix from the original graph using RBF and Katz; ii) eigen-decomposition of the normalized Laplacian of the high-order proximity matrix; iii) eigen-selection (lines 8-9 of Algorithm \ref{algo:SCOREH+}) and clustering. A step-by-step computation of this toy example is included in Appendix \ref{app:example}.
  • Figure 2: Optimal shaping parameters with RBF choices on real-world networks.
  • Figure 3: The topological displays for the Karate network from the SCORE, SCORE+, and our SCOREH+ algorithms. \ref{fig:karate:a} is plotted from the ground truth of the network for comparison; \ref{fig:karate:b}, \ref{fig:karate:c}, and \ref{fig:karate:d} are from the node labels discovered by SCORE, SCORE+, and our SCOREH+, respectively. Similar settings apply to Fig. \ref{fig:uk}.
  • Figure 4: The topological displays for the UKfaculty network from the SCORE, SCORE+, and our SCOREH+ algorithms (\ref{fig:uk:a} is from the ground truth of the network).
  • Figure 5: The topological displays for Caltech (Fig. \ref{fig:real:a}), Football (Fig. \ref{fig:real:b}), Blog (Fig. \ref{fig:real:c}), and Simmons (Fig. \ref{fig:real:d}), respectively, from the results of our SCOREH+ algorithm.
  • ...and 8 more figures