Linking Heterogeneous Data with Coordinated Agent Flows for Social Media Analysis

Shifu Chen, Dazhen Deng, Zhihong Xu, Sijia Xu, Tai-Quan Peng, Yingcai Wu

TL;DR

The paper tackles the challenge of analyzing heterogeneous social media data by introducing Social Insight Agents (SIA), an LLM-based agent system that links multi-modal inputs through coordinated agent flows. It develops a bottom-up taxonomy of social media insights and a data coordinator to unify tabular, textual, and network data, supported by an interactive interface for transparent, traceable reasoning. Key contributions include the taxonomy guiding method–visualization choices, the coordinated agent framework with a heterogeneity coordinator, and a human–AI collaborative workflow demonstrated through expert case studies and quantitative evaluation. The work advances practical cross-modal social media analysis by enabling adaptive, explainable insight discovery and robust human–agent collaboration in complex analytical tasks, with potential for broader applicability and extension.

Abstract

Social media platforms generate massive volumes of heterogeneous data, capturing user behaviors, textual content, temporal dynamics, and network structures. Analyzing such data is crucial for understanding phenomena such as opinion dynamics, community formation, and information diffusion. However, discovering insights from this complex landscape is exploratory, conceptually challenging, and requires expertise in social media mining and visualization. Existing automated approaches, though increasingly leveraging large language models (LLMs), remain largely confined to structured tabular data and cannot adequately address the heterogeneity of social media analysis. We present SIA (Social Insight Agents), an LLM agent system that links heterogeneous multi-modal data -- including raw inputs (e.g., text, network, and behavioral data), intermediate outputs, mined analytical results, and visualization artifacts -- through coordinated agent flows. Guided by a bottom-up taxonomy that connects insight types with suitable mining and visualization techniques, SIA enables agents to plan and execute coherent analysis strategies. To ensure multi-modal integration, it incorporates a data coordinator that unifies tabular, textual, and network data into a consistent flow. Its interactive interface provides a transparent workflow where users can trace, validate, and refine the agent's reasoning, supporting both adaptability and trustworthiness. Through expert-centered case studies and quantitative evaluation, we show that SIA effectively discovers diverse and meaningful insights from social media while supporting human-agent collaboration in complex analytical tasks.

Paper Structure

This paper contains 35 sections, 14 equations, 6 figures, 1 table.

Figures (6)

  • Figure 1: Overview of SIA. The planner decomposes user goals into actionable steps and coordinates the invocation of query, mining, and visualization agents. The heterogeneity coordinator ensures smooth data flow by adapting formats across agents. The entire workflow is presented transparently, allowing users to review, validate, and refine both reasoning and outcomes.
  • Figure 2: Overview of the planner. It decomposes a user goal into exploration directions, invokes query, mining, and visualization agents with API guidance, and selects effective results for the final insight report. Control signals include Navigate (green arrow), Terminate (yellow cross), Terminate (grey icon).
  • Figure 3: Role of the heterogeneity coordinator. This component manages data heterogeneity across agents by transforming outputs into required input formats and linking entities through shared identifiers. It ensures that query, mining, and visualization agents can operate seamlessly despite differences in data structure and modality.
  • Figure 4: System interface. Chat Panel (A) facilitates dialogue between users and the agent. Action View (B) selectively displays the agent's actions during insight discovery with expandable details. Mining Result View (D) visualizes relationships between hyperparameters and corresponding results and allows users to add configurations within a Miner node. Report View (C) shows the final conclusions of the discovery.
  • Figure 5: Temporal analysis of COVID-19 social media discussions. The line chart shows weekly post volume across three distinct phases. Three word cloud groups reveal how topics evolve across these phases. The coordinated interactions between these visualizations are established through linkages maintained in the visualization coordinator.
  • ...and 1 more figure
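
The Figure 3 caption describes the heterogeneity coordinator as transforming agent outputs into required input formats and linking entities through shared identifiers. The following is a minimal sketch of that idea, not the paper's implementation. It assumes `user_id` is the shared identifier and treats edges as undirected; the names `LinkedRecord`, `link_by_user_id`, and `to_edge_table`, along with the sample data, are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class LinkedRecord:
    """Everything known about one entity, keyed by a shared identifier."""
    user_id: str
    profile: dict = field(default_factory=dict)  # tabular row (hypothetical schema)
    posts: list = field(default_factory=list)    # textual content
    neighbors: set = field(default_factory=set)  # network structure


def link_by_user_id(rows, posts, edges):
    """Join tabular rows, text posts, and network edges on `user_id`."""
    linked = {}

    def record(uid):
        # Create the record on first sight so entities present in only
        # one modality are still represented.
        return linked.setdefault(uid, LinkedRecord(user_id=uid))

    for row in rows:
        record(row["user_id"]).profile = row
    for post in posts:
        record(post["user_id"]).posts.append(post["text"])
    for src, dst in edges:
        # Assumption: edges are undirected, so both endpoints are updated.
        record(src).neighbors.add(dst)
        record(dst).neighbors.add(src)
    return linked


def to_edge_table(linked):
    """Adapt the linked view into a flat, deduplicated edge list,
    the kind of format a downstream charting step might require."""
    pairs = {
        tuple(sorted((uid, nb)))
        for uid, rec in linked.items()
        for nb in rec.neighbors
    }
    return sorted(pairs)


# Illustrative inputs only; not data from the paper.
rows = [{"user_id": "u1", "followers": 120}, {"user_id": "u2", "followers": 8}]
posts = [{"user_id": "u1", "text": "first post"}, {"user_id": "u3", "text": "a reply"}]
edges = [("u1", "u2"), ("u3", "u1")]

linked = link_by_user_id(rows, posts, edges)
print(to_edge_table(linked))
```

Keying every modality on one identifier means an entity that appears only in the text or only in the network (here `u3`) still gets a record, so later steps can detect missing modalities rather than silently dropping the entity.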