Towards Effective, Efficient and Unsupervised Social Event Detection in the Hyperbolic Space
Xiaoyan Yu, Yifan Wei, Shuaishuai Zhou, Zhiwei Yang, Li Sun, Hao Peng, Liehuang Zhu, Philip S. Yu
TL;DR
This work tackles unsupervised social event detection (SED) on large, dynamic social media data by introducing HyperSED, which represents messages as semantic anchors and learns their interrelations in hyperbolic space to produce compact, structure-aware representations. A differentiable structural-information framework guides the bottom-up construction of a partitioning tree that reveals the detected events while remaining efficient on large-scale data. HyperSED builds a semantic-based anchor message graph (SAMG) to reduce data size while preserving the key interrelations among messages, and employs a hyperbolic graph autoencoder to capture both the structure and the geometry of the anchors. Across English and French Twitter datasets, HyperSED achieves competitive accuracy and substantial speedups over prior unsupervised approaches, which makes it relevant to real-time event detection and monitoring.
Abstract
The vast, complex, and dynamic nature of social message data poses challenges to social event detection (SED). Despite considerable effort, these challenges persist, often resulting in inadequately expressive message representations (ineffective) and prolonged learning durations (inefficient). In response, this work introduces an unsupervised framework, HyperSED (Hyperbolic SED). Specifically, the proposed framework first models social messages as semantic-based message anchors, and then leverages the structure of the anchor graph and the expressiveness of hyperbolic space to acquire structure- and geometry-aware anchor representations. Finally, HyperSED builds the partitioning tree of the anchor message graph by incorporating differentiable structural information, so that the tree reflects the detected events. Extensive experiments on public datasets demonstrate HyperSED's competitive performance, along with a substantial improvement in efficiency over the current state-of-the-art unsupervised paradigm. On average, HyperSED improves incremental SED by 2%, 2%, and 25% in NMI, AMI, and ARI, respectively, and improves efficiency by at least 12.10 times and up to 37.41 times. Our code is publicly available at https://github.com/XiaoyanWork/HyperSED.
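To give some intuition for the "expressiveness of hyperbolic space" mentioned above, the following is a minimal sketch of the geodesic distance in the Poincaré ball, one common model of hyperbolic space. The abstract does not say which model HyperSED uses, so the choice of the Poincaré ball, unit curvature, and the function name `poincare_distance` are all assumptions made for illustration; this is not taken from the authors' code.

```python
import math


def poincare_distance(u, v):
    """Geodesic distance between two points strictly inside the unit Poincare ball.

    Illustrative helper (hypothetical name); assumes curvature -1.
    d(u, v) = arcosh(1 + 2 * |u - v|^2 / ((1 - |u|^2) * (1 - |v|^2)))
    """
    sq_norm_u = sum(x * x for x in u)
    sq_norm_v = sum(x * x for x in v)
    if sq_norm_u >= 1.0 or sq_norm_v >= 1.0:
        raise ValueError("points must lie strictly inside the unit ball")
    sq_diff = sum((a - b) ** 2 for a, b in zip(u, v))
    arg = 1.0 + 2.0 * sq_diff / ((1.0 - sq_norm_u) * (1.0 - sq_norm_v))
    return math.acosh(arg)


# Two pairs of points with the same Euclidean gap (0.1),
# one pair near the origin and one pair near the boundary of the ball.
near_origin = poincare_distance((0.0, 0.0), (0.1, 0.0))
near_boundary = poincare_distance((0.85, 0.0), (0.95, 0.0))
print(near_origin < near_boundary)
```

The same Euclidean gap corresponds to a much larger hyperbolic distance near the boundary than near the origin. This growth of available space toward the boundary is what lets hyperbolic embeddings represent tree-like hierarchies, such as a partitioning tree, with low distortion.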
