TAMA: A Human-AI Collaborative Thematic Analysis Framework Using Multi-Agent LLMs for Clinical Interviews

Huimin Xu, Seungjun Yi, Terence Lim, Jiawei Xu, Andrew Well, Carlos Mery, Aidong Zhang, Yuji Zhang, Heng Ji, Keshav Pingali, Yan Leng, Ying Ding

TL;DR

The paper tackles the resource-intensive nature of thematic analysis (TA) in healthcare by introducing TAMA, a human-AI collaborative framework that uses multi-agent LLMs guided by a domain expert. TAMA coordinates a team of three agents (Generation, Evaluation, Refinement) with a cardiac expert to generate, evaluate, and refine themes from clinical transcripts. This yields themes that are more distinct and better aligned with human-generated themes, while drastically reducing manual workload. The framework is evaluated on de-identified transcripts of interviews with parents of children with AAOCA, using Jaccard Similarity, Hit Rate, and embedding-based cosine similarity as metrics. The evaluation shows that automated TA can be performed in under 10 minutes with thematic depth comparable to manual analysis. The work supports broader adoption of automated TA in high-stakes clinical contexts, offering a scalable, human-in-the-loop approach to qualitative research that balances efficiency with reliability and clinical relevance.

Abstract

Thematic analysis (TA) is a widely used qualitative approach for uncovering latent meanings in unstructured text data. TA provides valuable insights in healthcare but is resource-intensive. Large Language Models (LLMs) have been introduced to perform TA, yet their applications in healthcare remain unexplored. Here, we propose TAMA: A Human-AI Collaborative Thematic Analysis framework using Multi-Agent LLMs for clinical interviews. We leverage the scalability and coherence of multi-agent systems through structured conversations between agents and coordinate the expertise of cardiac experts in TA. Using interview transcripts from parents of children with Anomalous Aortic Origin of a Coronary Artery (AAOCA), a rare congenital heart disease, we demonstrate that TAMA outperforms existing LLM-assisted TA approaches, achieving higher thematic hit rate, coverage, and distinctiveness. By combining multi-agent LLM systems with human-in-the-loop integration, TAMA demonstrates strong potential for automated TA in clinical settings, enhancing quality while significantly reducing manual workload.
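
The generate, evaluate, and refine cycle described above can be sketched as a simple control loop. This is only an illustrative sketch, not the authors' implementation: the function name `run_theme_loop`, the callable signatures, the `max_rounds` cap, and the convention that the expert returns `None` to accept the themes are all assumptions made for the example. In practice each callable would wrap an LLM agent or a human review step.

```python
from typing import Callable, List, Optional, Tuple

Themes = List[str]

def run_theme_loop(
    transcript: str,
    generate: Callable[[str], Themes],                    # hypothetical Generation agent
    evaluate: Callable[[Themes], str],                    # hypothetical Evaluation agent
    refine: Callable[[Themes, str, str], Themes],         # hypothetical Refinement agent
    expert_feedback: Callable[[Themes, str], Optional[str]],  # human in the loop
    max_rounds: int = 3,                                  # assumed cap on iterations
) -> Tuple[Themes, List[Themes]]:
    """Run generation once, then alternate evaluation and refinement.

    The expert sees the current themes plus the evaluator's critique and
    either returns guidance (a string) or None to accept the themes.
    """
    themes = generate(transcript)
    history = [themes]
    for _ in range(max_rounds):
        critique = evaluate(themes)
        guidance = expert_feedback(themes, critique)
        if guidance is None:  # expert accepts the current themes
            break
        themes = refine(themes, critique, guidance)
        history.append(themes)
    return themes, history
```

Keeping the agents and the expert as plain callables makes it easy to swap in stubs for testing and to log every intermediate theme set through `history`.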

Paper Structure

This paper contains 24 sections, 3 equations, 4 figures, 2 tables.

Figures (4)

  • Figure 1: The TAMA Framework is a human-in-the-loop, multi-agent system designed to generate, evaluate, and refine themes from clinical interview transcripts.
  • Figure 2: Human-in-the-Loop in the TAMA Framework. The cardiac expert collaborates with each LLM agent, providing domain expertise to ensure accurate and clinically relevant outcomes.
  • Figure 3: Comparison of Jaccard Similarity and Hit Rate for LLM-Generated Themes (Before and After Evaluation) Relative to Human-Generated Themes. Higher values for both metrics indicate greater overlap between LLM-generated and human-generated themes.
  • Figure 4: Similarity Matrix Between Human-Generated Themes and LLM-Generated Themes (Before and After Evaluation). Each row represents a human-generated theme, and each column represents an LLM-generated theme. Cell values are the similarity scores between the two; higher scores indicate greater similarity (1 = perfect overlap, 0 = no overlap). Full theme names are listed in Table \ref{tab:theme_comparison}. The first sentence of each theme description is used for comparison, as theme names are too short for evaluation.
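
The metrics named in Figures 3 and 4 can be illustrated with a small sketch. The exact definitions used in the paper are not given in this overview, so the version below rests on stated assumptions: two themes count as a match when the cosine similarity of their embeddings reaches a chosen `threshold`, Hit Rate is the share of LLM themes that match at least one human theme, and Jaccard Similarity is the matched count divided by the size of the union of both theme sets. The toy 2-D vectors stand in for real sentence embeddings.

```python
import math
from typing import List, Sequence

def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity of two vectors; 0.0 if either has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def similarity_matrix(human_vecs, llm_vecs) -> List[List[float]]:
    """Rows are human themes, columns are LLM themes (as in Figure 4)."""
    return [[cosine(h, l) for l in llm_vecs] for h in human_vecs]

def hit_rate(sim: List[List[float]], threshold: float) -> float:
    """Assumed definition: share of LLM themes matching any human theme."""
    n_llm = len(sim[0]) if sim else 0
    if n_llm == 0:
        return 0.0
    hits = sum(any(row[j] >= threshold for row in sim) for j in range(n_llm))
    return hits / n_llm

def jaccard_similarity(sim: List[List[float]], threshold: float) -> float:
    """Assumed definition: matched themes / union of both theme sets."""
    n_human = len(sim)
    n_llm = len(sim[0]) if sim else 0
    if n_human == 0 or n_llm == 0:
        return 0.0
    matched_human = sum(any(s >= threshold for s in row) for row in sim)
    matched_llm = sum(any(row[j] >= threshold for row in sim) for j in range(n_llm))
    # Take the smaller count so one theme cannot be matched more than once.
    intersection = min(matched_human, matched_llm)
    return intersection / (n_human + n_llm - intersection)

# Toy embeddings (illustrative only): 2 human themes, 3 LLM themes.
human = [[1.0, 0.0], [0.0, 1.0]]
llm = [[1.0, 0.0], [1.0, 1.0], [-1.0, 0.0]]
sim = similarity_matrix(human, llm)
print(hit_rate(sim, 0.9), jaccard_similarity(sim, 0.9))
```

Both scores depend strongly on the threshold: lowering it from 0.9 to 0.7 in this toy case lets the second LLM theme count as a match for both human themes, which raises both metrics.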