Can LLMs Simulate Social Media Engagement? A Study on Action-Guided Response Generation

Zhongyi Qiu, Hanjia Lyu, Wei Xiong, Jiebo Luo

TL;DR

The paper investigates whether LLMs can realistically simulate social-media engagement by first predicting a user's engagement action (retweet, quote, or rewrite) and then generating a personalized response conditioned on that action. It introduces an action-guided response generation framework and benchmarks multiple LLMs against BERT baselines on a COVID-19 vaccination discourse dataset from X, evaluating both action prediction and the semantic and stylistic alignment of generated posts. Zero-shot LLMs underperform fine-tuned BERT in action prediction, while few-shot prompts can degrade action accuracy yet improve the semantic alignment of responses, which points to a trade-off between classification performance and generative fidelity. The work clarifies how well LLMs capture engagement dynamics and informs design choices for controlled, human-aligned social-media simulations, with implications for ethics and deployment in research.

Abstract

Social media enables dynamic user engagement with trending topics, and recent research has explored the potential of large language models (LLMs) for response generation. While some studies investigate LLMs as agents for simulating user behavior on social media, their focus remains on practical viability and scalability rather than a deeper understanding of how well LLMs align with human behavior. This paper analyzes LLMs' ability to simulate social media engagement through action-guided response generation, where a model first predicts a user's most likely engagement action (retweet, quote, or rewrite) toward a trending post before generating a personalized response conditioned on the predicted action. We benchmark GPT-4o-mini, O1-mini, and DeepSeek-R1 in social media engagement simulation regarding a major societal event discussed on X. Our findings reveal that zero-shot LLMs underperform BERT in action prediction, while few-shot prompting initially degrades the prediction accuracy of LLMs with limited examples. However, in response generation, few-shot LLMs achieve stronger semantic alignment with ground truth posts.
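
The two-stage flow described in the abstract (predict an action, then generate a response conditioned on it) can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration, not the paper's implementation: the prompt wording, the `simulate_engagement` and `parse_action` helpers, the `Engagement` container, and the choice to return the trending post unchanged for a retweet. The `llm` argument is any callable that maps a prompt string to a completion string.

```python
from dataclasses import dataclass
from typing import Callable, Sequence

# The three engagement actions named in the abstract.
ACTIONS = ("retweet", "quote", "rewrite")


@dataclass
class Engagement:
    action: str    # one of ACTIONS
    response: str  # text the simulated user would post


def parse_action(raw: str) -> str:
    """Map free-form model output to one of the three action labels."""
    text = raw.strip().lower()
    for action in ACTIONS:
        if action in text:
            return action
    raise ValueError(f"no known action in model output: {raw!r}")


def simulate_engagement(
    trending_post: str,
    user_history: Sequence[str],
    llm: Callable[[str], str],  # hypothetical: prompt in, completion out
) -> Engagement:
    history = "\n".join(f"- {post}" for post in user_history)

    # Stage 1: predict the engagement action (hypothetical prompt wording).
    action_prompt = (
        "Choose one action (retweet, quote, rewrite) for this user.\n"
        f"User history:\n{history}\n"
        f"Trending post: {trending_post}"
    )
    action = parse_action(llm(action_prompt))

    # Assumption of this sketch: a retweet reshares the post unchanged,
    # so no text is generated for it.
    if action == "retweet":
        return Engagement(action, trending_post)

    # Stage 2: generate a response conditioned on the predicted action.
    response_prompt = (
        f"Write this user's {action} of the trending post.\n"
        f"User history:\n{history}\n"
        f"Trending post: {trending_post}"
    )
    return Engagement(action, llm(response_prompt).strip())
```

Passing the model as a plain callable keeps the sketch independent of any particular API client, and few-shot examples could be added by extending the prompt strings.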

Paper Structure

This paper contains 17 sections, 4 figures, 3 tables, 1 algorithm.

Figures (4)

  • Figure 1: An illustration of action-guided response generation: The model first predicts the engagement action in response to the trending post. The post is then generated based on the predicted action.
  • Figure 2: Statistics of quote, rewrite, and retweet actions predicted by different models. The models are grouped into GPT-4o-mini, O1-mini, DeepSeek, and BERT, with three settings: ZS_base, ZS_info, and FS_hist. Ground truth is also included as a reference.
  • Figure 3: Model accuracy for action prediction across different few-shot settings.
  • Figure 4: Effect of action order in the prompt on LLM predictions.