Graph Analysis Using a GPU-based Parallel Algorithm: Quantum Clustering

Zhe Wang, ZhiJie He, Ding Liu

TL;DR

This work extends Quantum Clustering (QC) to graph data by formulating a potential function via a quantum-inspired Gaussian kernel and locating cluster centers with a Graph Gradient Descent procedure. The approach is implemented with GPU acceleration to compute potentials efficiently on large graphs, and is evaluated on five standard datasets against eight baselines using Modularity, ARI, FMI, and NMI. Empirical results show QC achieving competitive or superior clustering performance, with particular strength on the Karate Club graph and when node features are unavailable; the method also remains faster than several traditional graph clustering techniques. The analysis highlights the sensitivity to the width parameter $\sigma$, shows that performance is stable once $\sigma$ is tuned, and points to broad applicability in biology, social networks, and text mining, with potential for substantial speedups via GPU parallelization.

Abstract

The article introduces a new method for applying Quantum Clustering to graph structures. Quantum Clustering (QC) is a novel density-based unsupervised learning method that determines cluster centers by constructing a potential function. In this method, we use the Graph Gradient Descent algorithm to find the centers of clusters. GPU parallelization is utilized for computing potential values. We also conducted experiments on five widely used datasets and evaluated using four indicators. The results show superior performance of the method. Finally, we discuss the influence of $\sigma$ on the experimental results.
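
To make the two ingredients named in the abstract concrete, here is a minimal CPU-only sketch, not the authors' implementation. It assumes the classical QC potential, $V(x) = E - \frac{d}{2} + \frac{1}{2\sigma^2 \psi(x)} \sum_i \lVert x - x_i \rVert^2 e^{-\lVert x - x_i \rVert^2 / (2\sigma^2)}$ with $\psi(x) = \sum_i e^{-\lVert x - x_i \rVert^2 / (2\sigma^2)}$, and replaces the Euclidean distance with the hop-count (shortest-path) distance between nodes; the distance definition the paper actually uses is not stated in this summary. The additive constant $E - \frac{d}{2}$ is dropped because it does not change where the minima lie. The function names `bfs_distances`, `qc_potential`, and `graph_gradient_descent` are hypothetical, and the descent rule (step to the lowest-potential neighbor until none is lower) is one plausible reading of "Graph Gradient Descent".

```python
import math
from collections import deque


def bfs_distances(adj, source):
    """Hop-count distance from `source` to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def qc_potential(adj, sigma):
    """QC potential at each node, up to the additive constant E - d/2.

    Assumption: hop-count distance stands in for the Euclidean distance
    of classical QC.
    """
    potential = {}
    for i in adj:
        dist = bfs_distances(adj, i)
        weights = {j: math.exp(-d * d / (2 * sigma ** 2)) for j, d in dist.items()}
        psi = sum(weights.values())  # Gaussian kernel sum (the "wave function")
        weighted_sq = sum(dist[j] ** 2 * w for j, w in weights.items())
        potential[i] = weighted_sq / (2 * sigma ** 2 * psi)
    return potential


def graph_gradient_descent(adj, potential):
    """Follow strictly decreasing potential along edges to a local minimum."""
    labels = {}
    for start in adj:
        current = start
        while True:
            best = min(adj[current], key=lambda v: potential[v], default=current)
            if potential[best] >= potential[current]:
                break  # no neighbor is strictly lower: local minimum reached
            current = best
        labels[start] = current  # the minimum node serves as the cluster label
    return labels


# Toy graph: two stars (centers 0 and 4) joined by an edge between leaves 3 and 5.
adj = {
    0: [1, 2, 3], 1: [0], 2: [0], 3: [0, 5],
    4: [5, 6, 7], 5: [4, 3], 6: [4], 7: [4],
}
potential = qc_potential(adj, sigma=1.0)
labels = graph_gradient_descent(adj, potential)
print(labels)
```

On this toy graph the two star centers have the lowest potential, so every node descends to the center of its own star. Each node's potential depends only on its own distance row, which is what makes the computation a natural fit for the per-node GPU parallelization the abstract describes. Nodes on a plateau of equal potential stop where they are in this sketch, so a real implementation would need a tie-breaking rule.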

Paper Structure

This paper contains 22 sections, 9 equations, 6 figures, 1 table, and 4 algorithms.

Figures (6)

  • Figure 1: Overview of Graph Clustering;
  • Figure 2: Schematic diagram of the GGD Algorithm;
  • Figure 3: Comparison of GPU and CPU acceleration. Our experiments show that, as the data size increases, computing the potential function takes significantly less time on the GPU than on the CPU, which demonstrates the notable speedup gained by computing the potential function on the GPU;
  • Figure 4: Visualization of the datasets used in our experiments. ForceAtlas2 is used as the layout algorithm. Different colors represent different clusters. (a) Cora dataset; (b) Cora-ML dataset; (c) Citeseer dataset; (d) Karate Club dataset; (e) Wiki dataset. Because the Wiki dataset lacks actual class labels, its classes are represented in the legend by placeholder categories such as "category1", "category2", and so on;
  • Figure 5: Comparison of time consumption among algorithms. The GPU version of QC is shown here. Because the data points of some algorithms are nearly indistinguishable in the figure, a zoomed-in subplot on the left gives a more precise comparison of time consumption. The star on AGC* and GCC* indicates that these two algorithms take the same input as the classical algorithms, namely the adjacency matrix, without being given node features;
  • ...and 1 more figure