AutoCas: Autoregressive Cascade Predictor in Social Networks via Large Language Models

Yuhao Zheng; Chenghua Gong; Rui Sun; Juyuan Zhang; Liming Pan; Linyuan Lv

TL;DR

AutoCas addresses cascade popularity prediction under diverse diffusion dynamics and limited data by repurposing decoder-only LLMs as cascade predictors. It tokenizes cascades into sequences, reformulates diffusion as an autoregressive task, and uses cascade prompts to adapt the model, keeping the LLM frozen and training only lightweight projection layers. On Weibo, Twitter, and APS, AutoCas achieves state-of-the-art MSLE and MAPE, and its accuracy improves as LLM size increases. The approach supports inference at arbitrary observation times without retraining, which is useful for viral marketing, misinformation control, and recommender systems.

Abstract

Popularity prediction in information cascades plays a crucial role in social computing, with broad applications in viral marketing, misinformation control, and content recommendation. However, information propagation mechanisms, user behavior, and temporal activity patterns exhibit significant diversity, necessitating a foundational model capable of adapting to such variations. At the same time, the amount of available cascade data remains relatively limited compared to the vast datasets used for training large language models (LLMs). Recent studies have demonstrated the feasibility of leveraging LLMs for time-series prediction by exploiting commonalities across different time-series domains. Building on this insight, we introduce the Autoregressive Information Cascade Predictor (AutoCas), an LLM-enhanced model designed specifically for cascade popularity prediction. Unlike natural language sequences, cascade data is characterized by complex local topologies, diffusion contexts, and evolving dynamics, requiring specialized adaptations for effective LLM integration. To address these challenges, we first tokenize cascade data to align it with sequence modeling principles. Next, we reformulate cascade diffusion as an autoregressive modeling task to fully harness the architectural strengths of LLMs. Beyond conventional approaches, we further introduce prompt learning to enhance the synergy between LLMs and cascade prediction. Extensive experiments demonstrate that AutoCas significantly outperforms baseline models in cascade popularity prediction while exhibiting scaling behavior inherited from LLMs. Code is available at this repository: https://anonymous.4open.science/r/AutoCas-85C6
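To make the tokenization and autoregressive reformulation described in the abstract more concrete, here is a minimal sketch. It is not the authors' implementation: it assumes a cascade is given only as a list of adoption timestamps, that a "token" is the number of adoptions in a fixed time window, and that the autoregressive task is to predict each token from the tokens before it. The function names `tokenize_cascade` and `autoregressive_pairs`, and the parameters `window` and `horizon`, are hypothetical and chosen for illustration. The actual model also encodes local topology and diffusion context, which this sketch omits.

```python
from bisect import bisect_right
from typing import List, Tuple


def tokenize_cascade(event_times: List[float], window: float, horizon: float) -> List[int]:
    """Turn adoption timestamps into per-window counts (illustrative tokens).

    Window k (1-based) covers ((k-1)*window, k*window]; an event at t = 0
    falls in the first window. Events after the last full window are ignored.
    """
    times = sorted(event_times)
    n_windows = int(horizon // window)
    counts: List[int] = []
    seen = 0  # number of events already assigned to earlier windows
    for k in range(1, n_windows + 1):
        upto = bisect_right(times, k * window)  # events with t <= k * window
        counts.append(upto - seen)
        seen = upto
    return counts


def autoregressive_pairs(tokens: List[int]) -> List[Tuple[List[int], int]]:
    """Build (context, next-token) pairs: predict token i from tokens[:i]."""
    return [(tokens[:i], tokens[i]) for i in range(1, len(tokens))]


# Hypothetical cascade: six adoptions observed over four time units.
tokens = tokenize_cascade([0.0, 0.5, 1.2, 1.7, 1.9, 3.4], window=1.0, horizon=4.0)
pairs = autoregressive_pairs(tokens)
popularity = sum(tokens)  # cumulative popularity at the end of the horizon
```

Under these assumptions, the cumulative popularity at any observation time is a prefix sum of the tokens, which illustrates how a next-token formulation can allow prediction at arbitrary observation times without retraining.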
