SASAV: Self-Directed Agent for Scientific Analysis and Visualization

Jianxin Sun, David Lenz, Tom Peterka, Hongfeng Yu

Abstract

Recent advances in frontier multimodal large language models (MLLMs) for data understanding and visual reasoning have shifted the role of LLMs from passive LLM-as-an-interface to proactive LLM-as-a-judge, enabling deeper integration into scientific data analysis and visualization pipelines. However, existing scientific visualization agents still rely on domain experts to supply either prior knowledge about specific datasets or visualization-oriented objective functions that guide the workflow through iterative feedback. This reactive, data-dependent, human-in-the-loop (HITL) paradigm is time-consuming and does not scale to large scientific datasets. In this work, we propose a Self-Directed Agent for Scientific Analysis and Visualization (SASAV), the first fully autonomous AI agent to perform scientific data analysis and generate insightful visualizations without any external prompting or HITL feedback. SASAV is a multi-agent system that automatically orchestrates data exploration workflows through our proposed components, including automated data profiling, context-aware knowledge retrieval, and reasoning-driven visualization parameter exploration, while supporting downstream interactive visualization tasks. This work establishes a foundational building block for future AI for Science, with the goal of accelerating scientific discovery and innovation at scale.

Paper Structure

This paper contains 31 sections, 3 equations, 11 figures, 3 tables.

Figures (11)

  • Figure 1: Evolution of AI for Science
  • Figure 2: Architecture of SASAV. $N$ is the number of RSVs selected for initial rendering, $M$ is the number of isovalues selected for isosurface rendering, and $K$ is the number of viewpoints sampled on the view surfaces.
  • Figure 3: Object recognition agentic workflow through evaluator and recognizer. Five distinct RSVs are used to construct the opacity TFs to render results for the evaluator to judge.
  • Figure 4: Forager agentic workflow to retrieve knowledge about the regions of interest with scientific significance.
  • Figure 5: Detailed workflow of Semantic Analyzer (SA) and Transfer Function Designer (TFD) on simulated scientific data (Flame dataset). SA conducts parallel range-of-interest perception for each isovalue, and TFD aggregates the results to generate semantic color and opacity mappings.
  • ...and 6 more figures