Can Large Language Models Serve as Data Analysts? A Multi-Agent Assisted Approach for Qualitative Data Analysis

Zeeshan Rasheed; Muhammad Waseem; Aakash Ahmad; Kai-Kristian Kemell; Wang Xiaofeng; Anh Nguyen Duc; Pekka Abrahamsson

TL;DR

This paper presents an LLM-based multi-agent system designed to automate qualitative data analysis in empirical software engineering. By deploying 27 specialized agents, the system automates thematic, content, narrative, discourse, and grounded theory analyses, aiming to accelerate analysis, improve consistency, and reduce manual effort. The authors demonstrate autonomous processing across multiple qualitative methods and provide publicly accessible code to enable validation and further exploration. They also discuss limitations and future work, including multilingual performance and ongoing expert feedback, to enhance practical applicability and governance. The work offers a scalable framework for integrating LLM-driven automation into qualitative research workflows with potential impact on researchers and practitioners handling large textual datasets.

Abstract

Context: Manual qualitative data analysis is time-intensive and can compromise validity and replicability, affecting analysis design, implementation, and reporting. Large Language Models (LLMs) enable human-bot collaboration in Software Engineering (SE), but their potential for qualitative data analysis in SE remains largely unexplored. Objective: This study aims to design and develop an LLM-based multi-agent system that combines human decision support with AI to automate various qualitative data analysis approaches. Methods: We used an LLM-based multi-agent system to assist the qualitative data analysis process, deploying 27 agents, each responsible for a specific task, such as text summarization, initial code generation, and extracting themes and patterns. Results: The main findings are: (1) the LLM-based multi-agent system accelerates the qualitative data analysis process, (2) the system effectively automates tasks such as text summarization, initial code generation, and theme extraction, and (3) the publicly accessible code facilitates validation and further evaluation. Conclusion: The proposed LLM-based multi-agent system automates the qualitative data analysis process, creating opportunities for researchers and practitioners. Future improvements focus on enhancing multilingual performance and integrating continuous expert feedback. The source code and details of the proposed system can be found here: https://github.com/GPT-Laboratory/Qualitative-Analysis-with-an-LLM-Based-Agentts
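
To make the idea of task-specific agents concrete, the following is a minimal sketch, not the authors' implementation. It assumes that agents are chained sequentially, with each agent's output passed to the next; the abstract does not state how the 27 agents are orchestrated. The names `Agent`, `run_pipeline`, and `echo_llm` are hypothetical, and `echo_llm` is an offline stub standing in for a real LLM call.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Hypothetical stand-in for an LLM call: takes a prompt, returns text.
LLM = Callable[[str], str]


@dataclass
class Agent:
    """One task-specific agent: a name plus a fixed instruction."""
    name: str
    instruction: str

    def run(self, llm: LLM, text: str) -> str:
        # Prepend the agent's instruction to the input text and query the LLM.
        return llm(f"{self.instruction}\n\n{text}")


def run_pipeline(agents: List[Agent], llm: LLM, document: str) -> Dict[str, str]:
    """Run agents in order, feeding each agent's output to the next.

    Sequential chaining is an assumption made for this sketch.
    """
    outputs: Dict[str, str] = {}
    current = document
    for agent in agents:
        current = agent.run(llm, current)
        outputs[agent.name] = current
    return outputs


def echo_llm(prompt: str) -> str:
    # Deterministic stub so the sketch runs offline: it returns the
    # instruction line of the prompt wrapped in brackets.
    instruction = prompt.split("\n", 1)[0]
    return f"[{instruction}]"


# Three of the tasks named in the abstract, as illustrative agents.
agents = [
    Agent("summarizer", "Summarize the text."),
    Agent("coder", "Generate initial codes."),
    Agent("theme_extractor", "Group the codes into themes."),
]

results = run_pipeline(agents, echo_llm, "Interview transcript goes here.")
for name, output in results.items():
    print(name, "->", output)
```

Swapping `echo_llm` for a function that calls an actual model would leave the rest of the sketch unchanged, which is the main reason the LLM is passed in as a plain callable.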
