Triangle Centrality
Paul Burkhardt
TL;DR
Triangle centrality is a new graph centrality measure that scores a vertex by the concentration of triangles around it, so a vertex can be central whether it lies in many triangles or in none at all. It has an exact linear-algebraic formulation built on the triangle matrix $T=A^2\circ A$, with the normalized centrality vector $\mathbf{r}_{TC}=\frac{\left(3A-2\check{T}+I\right)T\mathbf{1}}{\mathbf{1}^T T \mathbf{1}}$. The measure can be computed in optimal $O(m\bar{\delta})$ time, where $\bar{\delta}$ is the average degeneracy, which is nearly linear on sparse graphs in practice. The paper develops both combinatorial and algebraic algorithms, along with parallel ones: a CREW PRAM algorithm that runs in $O(\log n)$ time on $O(m\sqrt{m})$ processors, and a MapReduce algorithm that takes four rounds with $O(m\sqrt{m})$ communication bits. Experiments on real networks from SNAP show that in about 30% of cases triangle centrality identifies central vertices different from those found by traditional measures, while it typically agrees with them otherwise. Triangle centrality therefore offers a scalable view of vertex importance that complements degree-based and path-based centralities.
Abstract
Triangle centrality is introduced for finding important vertices in a graph based on the concentration of triangles surrounding each vertex. It has the distinct feature of allowing a vertex to be central if it is in many triangles or none at all. We show experimentally that triangle centrality is broadly applicable to many different types of networks. Our empirical results demonstrate that 30% of the time triangle centrality identified central vertices that differed from those found by five well-known centrality measures, which suggests novelty without being overly specialized. It is also asymptotically faster to compute on sparse graphs than all but the most trivial of these other measures. We introduce optimal algorithms that compute triangle centrality in $O(m\bar{\delta})$ time and $O(m+n)$ space, where $\bar{\delta}\le O(\sqrt{m})$ is the $\textit{average degeneracy}$ introduced by Burkhardt, Faber, and Harris (2020). In practical applications, $\bar{\delta}$ is much smaller than $\sqrt{m}$, so triangle centrality can be computed in nearly linear time. On a Concurrent Read Exclusive Write (CREW) Parallel Random Access Machine (PRAM), we give a near work-optimal parallel algorithm that takes $O(\log n)$ time using $O(m\sqrt{m})$ CREW PRAM processors. In MapReduce, we show it takes four rounds using $O(m\sqrt{m})$ communication bits and is therefore optimal. We also derive a linear algebraic formulation of triangle centrality which can be computed in $O(m\bar{\delta})$ time on sparse graphs.
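The linear algebraic formulation given in the TL;DR can be evaluated directly on a small graph. The following is a minimal sketch, not the paper's $O(m\bar{\delta})$ algorithm: it uses dense NumPy matrices, so it costs roughly cubic time in the number of vertices. It assumes $A$ is the symmetric 0/1 adjacency matrix of a simple undirected graph and that $\check{T}$ is $T$ with every nonzero entry set to 1. That reading of $\check{T}$, the function name `triangle_centrality`, and the example graph are assumptions made for illustration.

```python
import numpy as np


def triangle_centrality(A):
    """Evaluate r_TC = (3A - 2*T_check + I) T 1 / (1^T T 1) with dense matrices.

    A is assumed to be the symmetric 0/1 adjacency matrix of a simple
    undirected graph (no self-loops, no multi-edges).
    """
    A = np.asarray(A, dtype=np.int64)
    n = A.shape[0]
    # Triangle matrix T = A^2 (Hadamard) A: entry (u, v) counts the
    # triangles that contain the edge {u, v}.
    T = (A @ A) * A
    # Assumption: T_check is T with every nonzero entry replaced by 1.
    T_check = (T > 0).astype(np.int64)
    ones = np.ones(n, dtype=np.int64)
    t = T @ ones        # T 1: twice the triangle count of each vertex
    total = ones @ t    # 1^T T 1: six times the number of triangles
    if total == 0:
        raise ValueError("graph has no triangles; the formula is undefined")
    M = 3 * A - 2 * T_check + np.eye(n, dtype=np.int64)
    return (M @ t) / total


# Hypothetical example: a triangle on vertices 0, 1, 2 with a path 0-3-4 attached.
A_example = np.zeros((5, 5), dtype=np.int64)
for u, v in [(0, 1), (0, 2), (1, 2), (0, 3), (3, 4)]:
    A_example[u, v] = A_example[v, u] = 1

scores = triangle_centrality(A_example)
print(scores)
```

In this example vertex 3 lies in no triangle yet receives the same score as the three triangle vertices, because its neighbor 0 is in a triangle; vertex 4, whose only neighbor is in no triangle, scores zero. This illustrates the property that a vertex can be central without being in any triangle itself.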
