Quasi-Clique Discovery via Energy Diffusion
Yu Zhang, Yilong Luo, Mingyuan Ma, Yao Chen, Enqiang Zhu, Jin Xu, Chanjuan Liu
TL;DR
Quasi-clique discovery in large graphs is NP-hard, and existing methods are prone to seed sensitivity. This work introduces EDQC, which combines an adaptive energy-diffusion phase that concentrates mass in cohesive regions with conductance-guided extraction and a refinement step that guarantees the output is a $\gamma$-quasi-clique. Experiments on 75 real-world graphs show that EDQC finds larger quasi-cliques than state-of-the-art baselines on most datasets, with low variance across runs and competitive runtimes; statistical evidence supports its advantage. The approach offers a robust, density-controlled alternative for dense subgraph discovery in large-scale networks, with potential applications in fraud detection, web spam detection, and recommendation.
Abstract
Discovering quasi-cliques -- subgraphs whose edge density exceeds a given threshold -- is a fundamental task in graph mining with applications to web spam detection, fraud screening, and e-commerce recommendation. However, existing methods for quasi-clique discovery on large-scale web graphs are often sensitive to random seeds or lack explicit edge-density guarantees, making the task challenging in practice. This paper presents EDQC, an energy diffusion-based method for quasi-clique discovery. EDQC first employs an adaptive energy diffusion process to generate an energy ranking that highlights structurally cohesive regions. Guided by this energy ranking, the algorithm identifies a high-quality subgraph by minimizing conductance, a standard measure from community detection. This subgraph is then refined to meet the specified density threshold. Extensive experiments on 75 real-world graphs show that EDQC finds larger quasi-cliques on most datasets, with consistently lower variance across runs and competitive runtime. To the best of our knowledge, EDQC is the first method to incorporate energy diffusion into quasi-clique discovery.
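The three stages described in the abstract (diffusion-based ranking, conductance-guided extraction, refinement to the density threshold) can be illustrated with a minimal Python sketch. This is not the EDQC implementation: the abstract does not specify the adaptive energy diffusion rule, so a plain lazy random walk from a single seed stands in for it, and the refinement shown here is an assumed greedy peeling of the lowest-degree vertex. The density test assumes the common convention that a vertex set $S$ is a $\gamma$-quasi-clique when it induces at least $\gamma \binom{|S|}{2}$ edges. All function names and parameters (`diffuse`, `sweep_min_conductance`, `refine`, `steps`, `keep`) are hypothetical.

```python
from collections import defaultdict


def build_adj(edges):
    """Undirected adjacency sets; self-loops are ignored."""
    adj = defaultdict(set)
    for u, v in edges:
        if u != v:
            adj[u].add(v)
            adj[v].add(u)
    return dict(adj)


def edge_density(adj, nodes):
    """Fraction of possible edges present among `nodes` (1.0 if fewer than 2 nodes)."""
    nodes = set(nodes)
    n = len(nodes)
    if n < 2:
        return 1.0
    m = sum(len(adj[u] & nodes) for u in nodes) // 2
    return m / (n * (n - 1) / 2)


def diffuse(adj, seed, steps=10, keep=0.5):
    """Stand-in diffusion (assumption): lazy random walk started at `seed`."""
    energy = {seed: 1.0}
    for _ in range(steps):
        nxt = defaultdict(float)
        for u, e in energy.items():
            nxt[u] += keep * e                      # mass that stays put
            share = (1 - keep) * e / len(adj[u])    # mass sent to each neighbor
            for v in adj[u]:
                nxt[v] += share
        energy = dict(nxt)
    return energy


def sweep_min_conductance(adj, energy):
    """Scan prefixes of the degree-normalized energy ranking; keep the one
    with the lowest conductance cut / min(vol, total_vol - vol)."""
    order = sorted(energy, key=lambda u: energy[u] / len(adj[u]), reverse=True)
    total_vol = sum(len(nb) for nb in adj.values())
    in_set, vol, cut = set(), 0, 0
    best, best_phi = [], float("inf")
    for i, u in enumerate(order):
        inside = len(adj[u] & in_set)
        in_set.add(u)
        vol += len(adj[u])
        cut += len(adj[u]) - 2 * inside  # edges to the prefix stop being cut edges
        denom = min(vol, total_vol - vol)
        if denom > 0 and cut / denom < best_phi:
            best_phi, best = cut / denom, order[: i + 1]
    return set(best), best_phi


def refine(adj, nodes, gamma):
    """Assumed refinement: peel the lowest-internal-degree vertex until density >= gamma."""
    nodes = set(nodes)
    while len(nodes) > 1 and edge_density(adj, nodes) < gamma:
        worst = min(nodes, key=lambda u: (len(adj[u] & nodes), str(u)))
        nodes.remove(worst)
    return nodes


def quasi_clique_sketch(edges, seed, gamma):
    adj = build_adj(edges)
    energy = diffuse(adj, seed)
    candidate, _ = sweep_min_conductance(adj, energy)
    return refine(adj, candidate, gamma)
```

For example, calling `quasi_clique_sketch(edges, seed=0, gamma=0.9)` on a small graph returns a vertex set whose induced density is at least 0.9 by construction of the peeling loop. Unlike this single-seed sketch, the abstract emphasizes that EDQC has low variance across runs, so the sketch's dependence on the chosen seed should not be read as a property of the method itself.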
