
Cover Edge-Based Novel Triangle Counting

David A. Bader, Fuhuan Li, Zhihui Du, Palina Pauliuchenka, Oliver Alvarado Rodriguez, Anant Gupta, Sai Sri Vastav Minnal, Valmik Nahata, Anya Ganeshan, Ahmet Gundogdu, Jason Lew

TL;DR

This work addresses efficient triangle counting in large, sparse graphs by introducing a BFS-generated cover-edge set that reduces unnecessary edge checks. The core method, CETC, comes in sequential, shared-memory, and distributed-memory variants that perform competitively with state-of-the-art approaches by counting triangles from a compact edge subset while carefully handling duplicates. The authors provide an extensive open-source framework with 22 sequential and 11 parallel implementations, plus experiments on Graph500 RMAT and SNAP datasets, showing substantial speedups and large communication reductions in distributed settings (e.g., CETC-DM is estimated to need about 2368x less communication on scale-42 graphs). The results highlight the method's practicality across graph topologies and hardware, and the reproducible framework enables broader adoption and future extensions in high-performance triangle counting. Overall, CETC introduces a scalable, communication-aware paradigm that balances BFS preprocessing, edge intersections, and parallelism to advance triangle counting on modern architectures, with characterizations such as $O(m \, d_{max})$ and $O(m^{1.5})$-type behavior guiding its performance profile.

Abstract

Listing and counting triangles in graphs is a key algorithmic kernel for network analyses, including community detection, clustering coefficients, k-trusses, and triangle centrality. In this paper, we propose the novel concept of a cover-edge set that can be used to find triangles more efficiently. Leveraging the breadth-first search (BFS) method, we can quickly generate a compact cover-edge set. Novel sequential and parallel triangle counting algorithms that employ cover-edge sets are presented. The novel sequential algorithm performs competitively with the fastest previous approaches on both real and synthetic graphs, such as those from the Graph500 Benchmark and the MIT/Amazon/IEEE Graph Challenge. We implement 22 sequential algorithms for performance evaluation and comparison. At the same time, we employ OpenMP to parallelize 11 sequential algorithms, presenting an in-depth analysis of their parallel performance. Furthermore, we develop a distributed parallel algorithm that can asymptotically reduce communication on massive graphs. In our estimate from massive-scale Graph500 graphs, our distributed parallel algorithm can reduce the communication on a scale 36 graph by 1156x and on a scale 42 graph by 2368x. Comprehensive experiments are conducted on the recently launched Intel Xeon 8480+ processor and shed light on how graph attributes, such as topology, diameter, and degree distribution, can affect the performance of these algorithms.
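
The cover-edge idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes that the cover-edge set consists of the edges whose endpoints share a BFS level, that vertices are comparable integers, and that duplicates are avoided with a simple ordering rule. The names `bfs_levels` and `count_triangles_cover_edge` are hypothetical.

```python
from collections import deque


def bfs_levels(adj):
    """Assign a BFS level to every vertex, restarting once per component."""
    level = {}
    for root in adj:
        if root in level:
            continue
        level[root] = 0
        queue = deque([root])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in level:
                    level[v] = level[u] + 1
                    queue.append(v)
    return level


def count_triangles_cover_edge(edges):
    """Count triangles by intersecting neighborhoods of same-level edges only."""
    adj = {}
    for u, v in edges:
        if u == v:
            continue  # ignore self-loops
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)

    level = bfs_levels(adj)
    count = 0
    for u in adj:
        for v in adj[u]:
            # Assumed cover edge: endpoints on the same BFS level, taken once (u < v).
            if u < v and level[u] == level[v]:
                for w in adj[u] & adj[v]:
                    # Apex on another level: this is the triangle's only same-level edge.
                    # Apex on the same level: all three edges qualify, so keep u < v < w.
                    if level[w] != level[u] or v < w:
                        count += 1
    return count
```

Calling `count_triangles_cover_edge` on an edge list of integer pairs returns the triangle count; only same-level edges trigger a set intersection, which is where the reduction in edge checks would come from under these assumptions.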

Paper Structure

This paper contains 38 sections, 4 theorems, 11 figures, 6 tables, 15 algorithms.

Key Result

Lemma 1

Each triangle $\{u, v, w\}$ in a graph contains at least one horizontal-edge in an arbitrarily rooted BFS tree.
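
Lemma 1 can be checked empirically on a small graph. The following Python sketch assumes that a horizontal-edge is an edge whose two endpoints lie on the same BFS level, and that the graph is connected so every vertex receives a level. The function names and the example graph are hypothetical, and the brute-force triangle enumeration is for illustration only.

```python
from collections import deque
from itertools import combinations


def bfs_levels(adj, root):
    """Return the BFS level of every vertex reachable from root."""
    level = {root: 0}
    queue = deque([root])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in level:
                level[v] = level[u] + 1
                queue.append(v)
    return level


def every_triangle_has_horizontal_edge(adj, root):
    """Brute-force check of the lemma for one choice of BFS root."""
    level = bfs_levels(adj, root)
    for u, v, w in combinations(sorted(adj), 3):
        is_triangle = v in adj[u] and w in adj[u] and w in adj[v]
        if not is_triangle:
            continue
        # A same-level edge exists iff two of the three vertices share a level.
        has_horizontal = (
            level[u] == level[v]
            or level[u] == level[w]
            or level[v] == level[w]
        )
        if not has_horizontal:
            return False
    return True


# Hypothetical connected example: two triangles sharing edge (1, 2), plus a tail.
example = {
    0: {1, 2},
    1: {0, 2, 3},
    2: {0, 1, 3},
    3: {1, 2, 4},
    4: {3},
}
lemma_holds_for_all_roots = all(
    every_triangle_has_horizontal_edge(example, root) for root in example
)
```

The intuition is that adjacent vertices differ by at most one BFS level, so three mutually adjacent vertices cannot occupy three distinct levels; at least two share a level, and the edge between them is horizontal under the assumed definition.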

Figures (11)

  • Figure 1: An example to mark different edges based on a BFS spanning tree. The tree-edges are black, strut-edges are blue, and horizontal-edges are red.
  • Figure 2: The speedups of direction-oriented optimization compared with the duplicate counting counterparts.
  • Figure 3: The speedups of hash-based optimization compared with the MergePath method.
  • Figure 4: The speedups of Forward Algorithm and its variants compared with the MergePath method.
  • Figure 5: The speedups of CETC-Seq and its variants compared with the MergePath method.
  • ...and 6 more figures

Theorems & Definitions (6)

  • Definition 1: Cover-Edge, Cover-Edge Set and Covering Ratio
  • Definition 2: BFS-Edge
  • Lemma 1
  • Theorem 3: Cover-Edge Set Generation
  • Lemma 2
  • Theorem 4: Correctness