Large language models for spreading dynamics in complex systems

Shuyu Jiang, Hao Ren, Yichang Gao, Yi-Cheng Zhang, Li Qi, Dayong Xiao, Jie Fan, Rui Tang, Wei Wang

TL;DR

The paper surveys how large language models reshape spreading dynamics across digital and biological domains by acting as both analytical tools and active agents. It categorizes approaches into modeling, perception, and management, detailing digital-epidemic and biological-epidemic applications, from LLM-based agent modeling to multimodal data integration and intervention design. The review highlights frameworks such as FPS, GAG, CARE, Diffusion-LLM, and PandemicLLM, and discusses opportunities and challenges in scalability, robustness, and interpretability. Overall, LLMs offer new pathways to simulate, detect, forecast, and govern propagation processes, with implications for misinformation control and infectious disease response.

Abstract

Spreading dynamics is a central topic in the physics of complex systems and network science, providing a unified framework for understanding how information, behaviors, and diseases propagate through interactions among system units. In many propagation contexts, spreading processes are influenced by multiple interacting factors, such as information expression patterns, cultural contexts, living environments, cognitive preferences, and public policies, which are difficult to incorporate directly into classical modeling frameworks. Recently, large language models (LLMs) have exhibited strong capabilities in natural language understanding, reasoning, and generation, enabling explicit perception of semantic content and contextual cues in spreading processes and thereby supporting analysis of these influencing factors. Beyond serving as external analytical tools, LLMs can also act as interactive agents embedded in propagation systems, potentially influencing spreading pathways and feedback structures. Consequently, the roles and impacts of LLMs on spreading dynamics have become an active and rapidly growing research area across multiple disciplines. This review provides a comprehensive overview of recent advances in applying LLMs to the study of spreading dynamics across two representative domains: digital epidemics, such as misinformation and rumors, and biological epidemics, including infectious disease outbreaks. We first examine the foundations of epidemic modeling from a complex-systems perspective and discuss how LLM-based approaches relate to traditional frameworks. We then systematically review recent studies from three perspectives (epidemic modeling, epidemic detection and surveillance, and epidemic prediction and management) to clarify how LLMs enhance each area. Finally, we discuss open challenges and potential research directions.

Paper Structure (32 sections, 7 equations, 21 figures, 5 tables)

Figures (21)

  • Figure 1: Overview of the paper structure, illustrating a framework for spreading-dynamics analysis that integrates LLMs. On the left, biological epidemics and digital epidemics are shown as two representative classes of spreading processes. In the center, heterogeneous data sources, including textual, network, and multimodal signals, provide observations of spreading processes. LLMs are depicted as a core component that interacts with these data, acting both as analytical tools for processing and reasoning over spreading information and as active entities that may participate in information generation and dissemination. On the right, the figure organizes LLM-enhanced applications across key stages of spreading workflows, including modeling, detection, surveillance, prediction, and governance. The bottom part summarizes representative challenges and risks associated with LLM-enhanced spreading systems.
  • Figure 4: An illustration of how LLMs are trained to handle various complex tasks. (1) Pre-training: Self-supervised learning on massive datasets teaches the model language rules and knowledge, producing a base model. (2) Instruction Tuning: Supervised training on instruction data teaches the model to understand and follow human instructions during conversations. (3) Reinforcement Learning: Further optimization on preference data steers the output to be useful, safe, and consistent with expectations.
  • Figure 5: An illustration of the LLM workflow. (1) Data embedding: Various types of input data, including text, images, networks, and time series, are transformed into high-dimensional vectors known as embedding vectors, enabling their processing by the LLM. (2) Large Language Model: The LLM processes the embedding vectors to predict the next token, using Transformer-based neural networks with architectures such as encoder-decoder, causal decoder, or prefix decoder. (3) Task output formulation: The final step customizes the model's output format for different tasks, so that the LLM can be adapted to a wide range of applications, such as epidemic modeling, monitoring, detection, prediction, and management.
  • Figure 6: Comparison of classical epidemic modeling and LLM-based epidemic modeling. In classical agent-based models, each agent is characterized by a predefined set of attributes, such as influence, susceptibility, and network neighbors, and state transitions are computed through explicitly specified rules or equations based on these parameters and local interactions. In contrast, LLM-based generative agents are modeled as cognitively driven individuals whose state transitions emerge from internal reasoning processes and contextual perception. Rather than following fixed transition equations, these agents interpret environmental cues and generate context-dependent behaviors, allowing transmission dynamics and individual responses to adapt flexibly to cognitive, social, and environmental factors.
  • Figure 7: An example of opinion reasoning prompt templates for LLM-based agents.
  • ...and 16 more figures
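
The classical side of the comparison in the Figure 6 caption can be made concrete with a small sketch. The code below is an illustrative assumption, not code from the paper: it implements a rule-based agent model on a contact network in which each agent's state transition is computed from fixed parameters and its neighbors' states. The names `make_rule_based`, `step`, `simulate`, `beta`, and `gamma` are hypothetical, as is the choice of susceptible/infected/recovered states.

```python
import random
from typing import Callable, Dict, List

State = str  # one of "S" (susceptible), "I" (infected), "R" (recovered)
# A transition rule maps (own state, neighbor states, rng) to the next state.
Rule = Callable[[State, List[State], random.Random], State]
Graph = Dict[int, List[int]]


def make_rule_based(beta: float, gamma: float) -> Rule:
    """Classical rule (hypothetical parameters): fixed per-contact
    infection probability beta and per-step recovery probability gamma."""

    def rule(state: State, neighbor_states: List[State], rng: random.Random) -> State:
        if state == "S":
            infected = sum(1 for s in neighbor_states if s == "I")
            # Probability that at least one infected neighbor transmits.
            p_infect = 1.0 - (1.0 - beta) ** infected
            return "I" if rng.random() < p_infect else "S"
        if state == "I":
            return "R" if rng.random() < gamma else "I"
        return "R"  # recovered agents stay recovered

    return rule


def step(graph: Graph, states: Dict[int, State], rule: Rule,
         rng: random.Random) -> Dict[int, State]:
    """Synchronous update: every agent decides from the previous snapshot."""
    return {
        node: rule(states[node], [states[n] for n in graph[node]], rng)
        for node in graph
    }


def simulate(graph: Graph, states: Dict[int, State], rule: Rule,
             steps: int, seed: int = 0) -> List[Dict[int, State]]:
    """Run the model for a number of steps and return every snapshot."""
    rng = random.Random(seed)
    history = [dict(states)]
    for _ in range(steps):
        states = step(graph, states, rule, rng)
        history.append(states)
    return history


if __name__ == "__main__":
    # Ring of 10 agents with a single initial infection (toy example).
    ring = {i: [(i - 1) % 10, (i + 1) % 10] for i in range(10)}
    initial = {i: "S" for i in ring}
    initial[0] = "I"
    history = simulate(ring, initial, make_rule_based(beta=0.5, gamma=0.2), steps=20)
    for t, snapshot in enumerate(history):
        print(t, "".join(snapshot[i] for i in sorted(snapshot)))
```

Because the transition rule is passed in as a function, the simulation loop stays the same if the rule is replaced. Under the generative-agent view described in the caption, the decision would instead come from a model reasoning over contextual cues; that replacement is not shown here and would depend on the specific framework.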