Heterogeneous Social Event Detection via Hyperbolic Graph Representations

Zitai Qiu, Jia Wu, Jian Yang, Xing Su, Charu C. Aggarwal

TL;DR

This work tackles social event detection in heterogeneous social networks characterized by short texts, multi-type content, and limited labels. It introduces two hyperbolic-graph approaches: HSED, a supervised model that converts heterogeneous data into a unified homogeneous message graph and learns hyperbolic embeddings for node classification, and UHSED, an unsupervised variant that uses graph contrastive learning with a hyperbolic GCN encoder. The main contributions include the first application of hyperbolic space to heterogeneous social event detection, a data-unification pipeline via Word2Vec, and empirical evidence that hyperbolic representations better capture tree-like social data, outperforming Euclidean baselines in both supervised and unsupervised settings. These findings have practical implications for timely event detection in large-scale, hierarchical social networks and point toward future work on dynamic detection and label-scarce scenarios.
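The data-unification step mentioned above can be illustrated with a small sketch. The exact linking rule is not stated in this summary, so the code assumes a simple one: two messages are connected when they share at least one non-message node (a user or a word). The function name `build_message_graph` and the `"user:..."` / `"word:..."` labels are hypothetical, and the Word2Vec feature step is omitted.

```python
from collections import defaultdict
from itertools import combinations


def build_message_graph(messages):
    """Collapse a heterogeneous message network into message-message edges.

    `messages` maps a message id to the set of non-message nodes it touches
    (e.g. its author and the words it contains). ASSUMPTION: two messages are
    linked when they share at least one such node; the paper's own rule may
    differ.
    """
    # Invert the mapping: attribute node -> messages attached to it.
    attr_to_msgs = defaultdict(set)
    for msg_id, attrs in messages.items():
        for attr in attrs:
            attr_to_msgs[attr].add(msg_id)

    # Every pair of messages attached to the same attribute gets one
    # undirected edge, stored as a sorted tuple to avoid duplicates.
    edges = set()
    for msgs in attr_to_msgs.values():
        for m1, m2 in combinations(sorted(msgs), 2):
            edges.add((m1, m2))
    return edges


# Toy input with hypothetical node labels.
toy_messages = {
    "m1": {"user:alice", "word:flood"},
    "m2": {"user:bob", "word:flood"},
    "m3": {"user:bob", "word:concert"},
}
toy_edges = build_message_graph(toy_messages)
```

In this toy input, `m1` and `m2` are linked through a shared word and `m2` and `m3` through a shared user, while `m1` and `m3` stay unconnected. The resulting graph contains only message nodes, which is what makes it homogeneous.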

Abstract

Social events reflect the dynamics of society and, here, natural disasters and emergencies receive significant attention. The timely detection of these events can provide organisations and individuals with valuable information to reduce or avoid losses. However, due to the complex heterogeneities of the content and structure of social media, existing models can only learn limited information; large amounts of semantic and structural information are ignored. In addition, due to high labour costs, it is rare for social media datasets to include high-quality labels, which also makes it challenging for models to learn information from social media. In this study, we propose two hyperbolic graph representation-based methods for detecting social events from heterogeneous social media environments. For cases where a dataset has labels, we designed a Hyperbolic Social Event Detection (HSED) model that converts complex social information into a unified social message graph. This model addresses the heterogeneity of social media, and, with this graph, the information in social media can be used to capture structural information based on the properties of hyperbolic space. For cases where the dataset is unlabelled, we designed an Unsupervised Hyperbolic Social Event Detection (UHSED) model. This model is based on the HSED model but includes graph contrastive learning so that it works in unlabelled scenarios. Extensive experiments demonstrate the superiority of the proposed approaches.
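To make "the properties of hyperbolic space" concrete, the sketch below computes geodesic distance in the Poincaré ball with curvature $-1$, using the standard formula $d(u, v) = \operatorname{arcosh}\left(1 + \frac{2\lVert u - v\rVert^2}{(1 - \lVert u\rVert^2)(1 - \lVert v\rVert^2)}\right)$. Which hyperbolic model the paper uses is not stated in this summary, so the choice of the Poincaré ball is an assumption, and the function names are illustrative only.

```python
import math


def poincare_distance(u, v):
    """Geodesic distance between two points of the open unit ball.

    ASSUMPTION: Poincare ball model with curvature -1; the paper may use a
    different model or curvature.
    """
    sq_norm_u = sum(x * x for x in u)
    sq_norm_v = sum(x * x for x in v)
    if sq_norm_u >= 1.0 or sq_norm_v >= 1.0:
        raise ValueError("points must lie strictly inside the unit ball")
    sq_diff = sum((a - b) ** 2 for a, b in zip(u, v))
    arg = 1.0 + 2.0 * sq_diff / ((1.0 - sq_norm_u) * (1.0 - sq_norm_v))
    return math.acosh(arg)


def euclidean_distance(u, v):
    """Ordinary straight-line distance, for comparison."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


# Two "leaf-like" points near the boundary and the "root" at the origin.
root = (0.0, 0.0)
leaf_a = (0.9, 0.0)
leaf_b = (0.0, 0.9)

d_ab = poincare_distance(leaf_a, leaf_b)
d_via_root = poincare_distance(leaf_a, root) + poincare_distance(root, leaf_b)
```

For points close to the boundary, `d_ab` is much larger than the Euclidean distance and approaches `d_via_root`, the length of the path through the origin. Distances in a tree behave the same way, since the path between two leaves runs through their common ancestor, which is the intuition behind embedding tree-like social data in hyperbolic space.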

Paper Structure

This paper contains 46 sections, 35 equations, 11 figures, 10 tables, 2 algorithms.

Figures (11)

  • Figure 1: In tree-like structured data, the distance between node $A$ and node $B$ is difficult to calculate in Euclidean space, but the distance between node $A'$ and node $B'$ is easy to calculate. Here, $A'$ and $B'$ are the projections of nodes $A$ and $B$ onto hyperbolic space.
  • Figure 2: An illustrative example of a heterogeneous message graph: (a) A social message network containing three types of nodes (e.g., user, message, and word) and four types of links (e.g., post, contain, mention, retweet). (b) The network schema of a social message network. (c) An example of a meta-path in a social message network. (d) An example of a meta-structure.
  • Figure 3: An example of tree-like structured data. (a) An example of tree-structured data expanding exponentially in Euclidean space. (b) Comparison of the distance between two nodes in hyperbolic space (blue) and Euclidean space (black).
  • Figure 4: The framework of the HSED model. “HMG” denotes a homogeneous message graph. “H” denotes hyperbolic embeddings after the hyperbolic encoder. “Z” denotes the final representations.
  • Figure 5: Social media data processing: (a) The original heterogeneous message network generated from the raw messages. The different node colours denote different entities. (b) The process of learning message features via Word2Vec. (c) The homogeneous message network generated from Steps (a) and (b).
  • ...and 6 more figures