A Clique Partitioning-Based Algorithm for Graph Compression
Akshar Chavan, Sanaz Rabinia, Daniel Grosu, Marco Brocanelli
TL;DR
The paper aims to speed up path-dependent graph algorithms on large graphs by introducing CPGC, a lossless compression method for bipartite graphs that preserves reachability while substantially reducing the number of edges. CPGC improves on the Feder-Motwani (FM) clique-partitioning approach by using degree-based vertex selection and extracting multiple $\delta$-cliques per iteration, achieving an overall running time of $O(mn^{\delta})$ and a compression bound of $|E^*| = O(m/k)$. Empirically, CPGC achieves a compression ratio up to 26% greater than FM and runs up to 105.18 times faster on large dense graphs, and a matching algorithm given the compressed graph as input runs up to 72.83% faster than on the original graph. The approach thus provides a scalable, path-preserving compression framework that accelerates downstream graph tasks such as all-pairs shortest paths and matching while keeping path information exact. The paper also covers an extension to non-bipartite graphs, with appendices describing FM, examples, proofs, and the non-bipartite transformation.
Abstract
Reducing the running time of graph algorithms is vital for tackling real-world problems such as shortest paths and matching in large-scale graphs, where path information plays a crucial role. This paper addresses this challenge by proposing a new graph compression algorithm that partitions the graph into bipartite cliques and uses the partition to obtain a compressed graph with fewer edges that still preserves the path information. The compressed graph can then be used as input to other graph algorithms for which path information is essential, leading to a significant reduction in their running time, especially for large, dense graphs. The running time of the proposed algorithm is $O(mn^{\delta})$, where $0 \leq \delta \leq 1$, which is better than $O(mn^{\delta}\log^2 n)$, the running time of the best existing clique partitioning-based graph compression algorithm (the Feder-Motwani (\textsf{FM}) algorithm). Our extensive experimental analysis shows that our algorithm achieves a compression ratio up to $26\%$ greater and executes up to 105.18 times faster than the \textsf{FM} algorithm. In addition, on large graphs with up to 1.05 billion edges, it achieves a compression ratio of up to 3.9, reducing the number of edges by up to $74.36\%$. Finally, our tests with a matching algorithm on sufficiently large, dense graphs demonstrate a reduction in running time of up to $72.83\%$ when the input is the compressed graph obtained by our algorithm, compared to the case where the input is the original uncompressed graph.
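To make the compression step concrete, the following is a minimal Python sketch of how a partition into bipartite cliques can yield a smaller graph that preserves left-to-right reachability: each clique with sides $A$ and $B$ has its $|A|\cdot|B|$ edges replaced by $|A|+|B|$ edges through one new intermediate vertex. This illustrates the general clique-partitioning idea only. It does not show how CPGC selects cliques; the partition is simply supplied as input. The function names `compress_bicliques` and `reachable_pairs`, the `("w", i)` labels for intermediate vertices, and the toy graph are all assumptions made for this example.

```python
from itertools import product


def compress_bicliques(edges, bicliques):
    """Replace each bipartite clique (A, B) by a star through a new vertex.

    edges:     set of (u, v) pairs, u on the left side, v on the right side.
    bicliques: list of (A, B) vertex-set pairs, assumed edge-disjoint and
               complete (every (a, b) with a in A, b in B is an edge).
    Returns (left_to_mid, mid_to_right, leftover), the compressed edge sets.
    """
    remaining = set(edges)
    left_to_mid, mid_to_right = set(), set()
    for i, (A, B) in enumerate(bicliques):
        clique_edges = set(product(A, B))
        if not clique_edges <= remaining:
            raise ValueError("biclique is incomplete or overlaps an earlier one")
        remaining -= clique_edges          # drop the |A|*|B| original edges
        mid = ("w", i)                     # hypothetical intermediate vertex
        left_to_mid |= {(a, mid) for a in A}
        mid_to_right |= {(mid, b) for b in B}
    return left_to_mid, mid_to_right, remaining


def reachable_pairs(left_to_mid, mid_to_right, leftover):
    """Left-right pairs connected directly or via an intermediate vertex."""
    pairs = set(leftover)
    for a, mid in left_to_mid:
        pairs |= {(a, b) for m, b in mid_to_right if m == mid}
    return pairs


# Toy graph: a complete 3-by-4 bipartite clique plus one extra edge.
A = {"u1", "u2", "u3"}
B = {"v1", "v2", "v3", "v4"}
edges = set(product(A, B)) | {("u4", "v1")}

l2m, m2r, rest = compress_bicliques(edges, [(A, B)])
compressed_size = len(l2m) + len(m2r) + len(rest)

# Reachability between the two sides is unchanged by the compression.
assert reachable_pairs(l2m, m2r, rest) == edges

ratio = len(edges) / compressed_size       # 13 original edges / 8 compressed
```

In this toy case 13 edges shrink to 8, a ratio of 1.625. Assuming the compression ratio is defined as the original edge count divided by the compressed edge count, a ratio $r$ corresponds to removing a fraction $1 - 1/r$ of the edges, so the reported ratio of 3.9 gives $1 - 1/3.9 \approx 74.36\%$, consistent with the figure quoted in the abstract.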
