AISSISTANT: Human-AI Collaborative Review and Perspective Research Workflows in Data Science

Sasi Kiran Gaddipati, Farhana Keya, Gollam Rabby, Sören Auer

TL;DR

This work introduces AIssistant, the first open-source agentic framework for Human-AI collaborative generation of scientific perspectives and review research in data science, demonstrating that agent-augmented pipelines substantially reduce effort while maintaining research integrity through strategic human oversight.

Abstract

High-quality scientific review and perspective papers require substantial time and effort, limiting researchers' ability to synthesize emerging knowledge. While Large Language Model (LLM)-based AI Scientists have been applied to scientific workflows, existing frameworks focus primarily on autonomous workflows with very limited human intervention. We introduce AIssistant, the first open-source agentic framework for Human-AI collaborative generation of scientific perspectives and review research in data science. AIssistant employs specialized LLM-driven agents augmented with external scholarly tools and allows human intervention throughout the workflow. The framework consists of two main multi-agent systems: a Research Workflow with seven agents and a Paper Writing Workflow with eight agents. We conducted a comprehensive evaluation with both human expert reviewers and LLM-based assessment following NeurIPS standards. Our experiments show that OpenAI o1 achieves the highest quality scores with chain-of-thought prompting and augmented Literature Search tools. We also conducted a Human-AI interaction survey, whose results show time savings of 65.7%. We believe that our work establishes a baseline for Human-AI collaborative scientific workflows for review and perspective research in data science, demonstrating that agent-augmented pipelines substantially reduce effort while maintaining research integrity through strategic human oversight.

Paper Structure

This paper contains 14 sections, 5 figures, 4 tables.

Figures (5)

  • Figure 1: A schematic overview of the AIssistant framework, illustrating integrated research and paper-writing workflows with Human-in-the-Loop and flexible tool selection at each agent state.
  • Figure 2: The weighted average scores of human and LLM-based evaluation of generated perspective and review papers using AIssistant. ZS = Zero Shot, FS = Few Shot, CoT = Chain-of-Thought, Eval. = Evaluation, Persp. = Perspective.
  • Figure 3: Human and LLM-based evaluation score distributions across all conditions. The y-axis shows each combination of conditions: LLM = LLM-based evaluation, Human = Human evaluation, rev = Review papers, persp = Perspective papers, LST = Literature Search Tools, R1 = Reviewer 1, R2 = Reviewer 2, R3 = Reviewer 3.
  • Figure 4: Comparison of score improvement ($\Delta$) under LLM-based evaluation and human evaluation across prompting strategies. Panels (a) and (b) show results for GPT-4o-mini, while (c) and (d) show results for OpenAI o1, across both Review and Perspective paper types. Human evaluation generally shows positive improvement, particularly for OpenAI o1, demonstrating the value of human collaboration in the proposed AIssistant framework.
  • Figure 5: Comparison of time taken (a) and costs (b) using different LLMs and prompting strategies in AIssistant.