Adaptive Differentially Private Structural Entropy Minimization for Unsupervised Social Event Detection

Zhiwei Yang, Yuecen Wei, Haoran Li, Qian Li, Lei Jiang, Li Sun, Xiaoyan Yu, Chunming Hu, Hao Peng

TL;DR

This work tackles unsupervised social event detection under formal privacy guarantees by introducing ADP-SEMEvent, a two-stage framework. The first stage builds a private message graph with an adaptive differential privacy strategy; the second detects events by minimizing 2-dimensional structural entropy over optimal subgraphs. The mixed-sensitivity approach automatically adjusts the injected noise to daily event dynamics, preserving privacy while maintaining detection utility. Experiments on two public Twitter datasets show performance competitive with state-of-the-art baselines in open settings and robust protection against attribute-inference attacks, and they show how the privacy budget $\epsilon$ trades noise against accuracy. The result is a practical, unsupervised approach to privacy-preserving social event detection in open-world settings.

Abstract

Social event detection refers to extracting relevant message clusters from social media data streams to represent specific real-world events. It is important in numerous areas, such as opinion analysis, social safety, and decision-making. Most current methods are supervised and require access to large amounts of data. These methods need prior knowledge of the events and carry a high risk of leaking sensitive information in the messages, making them less applicable in open-world settings. Conducting unsupervised detection while fully utilizing the rich information in the messages and protecting data privacy therefore remains a significant challenge. To this end, we propose ADP-SEMEvent, a novel unsupervised social event detection framework that prioritizes privacy. ADP-SEMEvent is divided into two stages: the construction of the private message graph and the clustering of the private message graph. In the first stage, an adaptive differential privacy approach is used to construct a private message graph. In this process, our method adaptively applies differential privacy based on the events occurring each day in an open environment to maximize the use of the privacy budget. In the second stage, to address the reduction in data utility caused by noise, a novel 2-dimensional structural entropy minimization algorithm based on optimal subgraphs is used to detect events in the message graph. Notably, this process is unsupervised and does not compromise differential privacy. Extensive experiments on two public datasets demonstrate that ADP-SEMEvent can achieve detection performance comparable to state-of-the-art methods while maintaining reasonable privacy budget parameters.
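
To make the second stage more concrete, the following is a minimal sketch of the quantity being minimized, assuming the standard definition of 2-dimensional structural entropy for an undirected, unweighted graph under a given partition. The function name `structural_entropy_2d`, the toy graph, and the partitions are illustrative assumptions; this is not the paper's optimal-subgraph algorithm and it involves no differential privacy.

```python
import math
from collections import defaultdict


def structural_entropy_2d(edges, partition):
    """2D structural entropy of an undirected, unweighted graph.

    edges: list of (u, v) pairs.
    partition: dict mapping each node to a cluster id.
    (Hypothetical helper for illustration; names are assumptions.)
    """
    degree = defaultdict(int)
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    vol_g = sum(degree.values())  # graph volume = 2 * number of edges

    vol = defaultdict(int)  # volume (sum of degrees) of each cluster
    cut = defaultdict(int)  # edges with exactly one endpoint in the cluster
    for node, d in degree.items():
        vol[partition[node]] += d
    for u, v in edges:
        if partition[u] != partition[v]:
            cut[partition[u]] += 1
            cut[partition[v]] += 1

    h = 0.0
    # Uncertainty of locating a node inside its own cluster.
    for node, d in degree.items():
        h -= (d / vol_g) * math.log2(d / vol[partition[node]])
    # Uncertainty of entering each cluster from the rest of the graph.
    for c, v_c in vol.items():
        h -= (cut[c] / vol_g) * math.log2(v_c / vol_g)
    return h


# Toy graph: two triangles joined by a single bridge edge.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
good = structural_entropy_2d(edges, {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"})
single = structural_entropy_2d(edges, {n: "all" for n in range(6)})
print(good < single)  # the two-triangle partition has lower entropy
```

A partition that matches the graph's community structure yields a lower value than placing every node in one cluster, which is why minimizing this quantity can serve as an unsupervised clustering objective.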

Paper Structure

This paper contains 22 sections, 8 equations, 6 figures, 4 tables, and 2 algorithms.

Figures (6)

  • Figure 1: A toy example of an attack demonstrating the potential leakage of sensitive information such as persons or geopolitical entities.
  • Figure 2: The proposed ADP-SEMEvent framework. ADP-SEMEvent consists of two stages: the private message graph construction stage (stage 1) and the private message graph clustering stage (stage 2). Messages with the same color represent the same cluster according to the ground truth labels; orange edges $E_s$ are derived from 1-dimensional structural entropy, and purple edges $E_a$ are derived from relevant attributes; arrows of specific colors indicate specific operations.
  • Figure 3: The impact of different ways of initializing partitions on the results.
  • Figure 4: Sensitivity analysis results of the model to privacy budget $\epsilon$.
  • Figure 5: Validity verification results of Optimized Hierarchical 2D SE Minimization. ($\epsilon$ = None)
  • ...and 1 more figure

Theorems & Definitions (6)

  • Definition 1
  • Definition 2
  • Definition 3
  • Definition 4
  • Definition 5
  • Definition 6