Q-DISCO: Query-Centric Densest Subgraphs in Networks with Opinion Information
Tianyi Chen, Atsushi Miyauchi, Charalampos E. Tsourakakis
TL;DR
Q-DISCO addresses the problem of finding densely connected subgraphs whose node opinions align with a given query vector, formalized as maximizing subgraph density subject to a lower bound on the average agreement with the query. The authors prove that the problem is NP-hard and admits only weak approximation guarantees, and they propose two principled heuristics: Q-Lagrange, based on Lagrangian relaxation, and Q-Peeling, a greedy peeling method inspired by the dual of a linear programming (LP) relaxation. In experiments on Twitter, DBLP, and Deezer data, the methods identify meaningful, opinion-aligned dense communities and scale better than LP-based baselines. The work provides practical tools for analyzing opinion dynamics and cohesive substructures in opinion-rich networks, with potential applications in recommender systems and social science research.
Abstract
Given a network $G=(V,E)$, where each node $v$ is associated with a vector $\boldsymbol{p}_v \in \mathbb{R}^d$ representing its opinion about $d$ different topics, how can we uncover subsets of nodes that not only exhibit exceptionally high density but also possess positively aligned opinions on multiple topics? In this paper we focus on this novel algorithmic question, which is essential in an era where digital social networks are hotbeds of opinion formation and dissemination. We introduce a novel methodology anchored in the well-established densest subgraph problem. We analyze the computational complexity of our formulation, showing that our problem is NP-hard and eludes practically acceptable approximation guarantees. To navigate these challenges, we design two heuristic algorithms: the first is based on the Lagrangian relaxation of our formulation, while the second is a peeling algorithm based on the dual of a Linear Programming relaxation. We elucidate the theoretical underpinnings of their performance and validate their utility through empirical evaluation on real-world datasets. Among other datasets, we study Twitter datasets that we collected concerning timely issues, such as the Ukraine conflict and the discourse surrounding COVID-19 mRNA vaccines, to gauge the effectiveness of our methodology. Our empirical investigations verify that our algorithms are able to extract valuable insights from networks with opinion information.
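To make the idea of combining density with opinion agreement concrete, the following is a minimal illustrative sketch, not the paper's Q-Lagrange or Q-Peeling algorithm. It assumes that the agreement of node $v$ with a query vector $\boldsymbol{q}$ is the inner product $\boldsymbol{p}_v \cdot \boldsymbol{q}$, and it greedily peels nodes to maximize a penalized objective $(e(S) + \lambda \sum_{v \in S} \boldsymbol{p}_v \cdot \boldsymbol{q}) / |S|$, where $e(S)$ is the number of edges inside $S$. The function names `agreement` and `penalized_peeling`, the multiplier `lam`, and the choice of objective are all assumptions made for illustration.

```python
from collections import defaultdict


def agreement(p, q):
    """Inner product of an opinion vector p with the query vector q (assumed agreement measure)."""
    return sum(a * b for a, b in zip(p, q))


def penalized_peeling(edges, opinions, query, lam):
    """Greedy peeling for the assumed objective (e(S) + lam * sum of agreements in S) / |S|.

    edges    : iterable of (u, v) pairs; both endpoints must be keys of `opinions`
    opinions : dict mapping node -> opinion vector (list of floats)
    query    : query vector with the same dimension as the opinion vectors
    lam      : hypothetical Lagrange-style multiplier weighting agreement against density
    Returns the best node set seen during peeling and its objective value.
    """
    # Build an undirected simple graph (self-loops and duplicate edges are ignored).
    adj = defaultdict(set)
    for u, v in edges:
        if u != v:
            adj[u].add(v)
            adj[v].add(u)

    alive = set(opinions)
    agree = {v: agreement(opinions[v], query) for v in alive}
    deg = {v: len(adj[v]) for v in alive}
    num_edges = sum(deg.values()) // 2
    total_agree = sum(agree.values())

    best_set, best_score = set(), float("-inf")
    while alive:
        score = (num_edges + lam * total_agree) / len(alive)
        if score > best_score:
            best_set, best_score = set(alive), score

        # Remove the node contributing least to the penalized objective.
        v = min(alive, key=lambda x: deg[x] + lam * agree[x])
        alive.remove(v)
        num_edges -= deg[v]          # deg[v] counts only neighbors that are still alive
        total_agree -= agree[v]
        for u in adj[v]:
            if u in alive:
                deg[u] -= 1

    return best_set, best_score
```

For example, on a 4-clique whose nodes all agree with the query plus one pendant node that disagrees, this sketch peels the pendant node first and returns the clique. The simple linear scan for the minimum makes it quadratic in the number of nodes; a priority queue would be the natural choice for larger graphs.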
