CollabCoder: A Lower-barrier, Rigorous Workflow for Inductive Collaborative Qualitative Analysis with Large Language Models

Jie Gao, Yuchen Guo, Gionnieve Lim, Tianqin Zhang, Zheng Zhang, Toby Jia-Jun Li, Simon Tangi Perrault

TL;DR

CollabCoder introduces a three-phase, inductive CQA workflow that integrates LLMs to support independent open coding, iterative discussion, and codebook generation. The system emphasizes coders' independence, explicit documentation of decision-making, and quantitative metrics that surface (dis)agreements, aiming for rigorous yet accessible qualitative analysis. An empirical evaluation with 16 participants shows that, relative to Atlas.ti Web, CollabCoder eases the learning curve, fosters mutual understanding, and makes discussions more efficient, while highlighting challenges around autonomy, AI reliance, and feature usefulness. The work contributes design guidelines, an end-to-end AI-assisted CQA workflow, and practical insights on human-AI collaboration in qualitative analysis, with implications for scalable, rigorous qualitative research workflows.

Abstract

Collaborative Qualitative Analysis (CQA) can enhance qualitative analysis rigor and depth by incorporating varied viewpoints. Nevertheless, ensuring a rigorous CQA procedure can itself be both demanding and costly. To lower this bar, we take a theoretical perspective to design the CollabCoder workflow, which integrates Large Language Models (LLMs) into key inductive CQA stages: independent open coding, iterative discussions, and final codebook creation. In the open coding phase, CollabCoder offers AI-generated code suggestions and records decision-making data. During discussions, it promotes mutual understanding by sharing this data within the coding team and using quantitative metrics to identify coding (dis)agreements, aiding in consensus-building. In the code grouping stage, CollabCoder provides primary code group suggestions, lightening the cognitive load of finalizing the codebook. A 16-user evaluation confirmed the effectiveness of CollabCoder, demonstrating its advantages over existing software and providing empirical insights into the role of LLMs in CQA practice.

Paper Structure (73 sections, 11 figures, 10 tables)

This paper contains 73 sections, 11 figures, 10 tables.

Figures (11)

  • Figure 1: Collaborative Qualitative Analysis (CQA) [corbin2009basic, corbin1990grounded, richards2018practical] is an iterative process in which coders go through multiple rounds to reach a final consensus. Our goal with CollabCoder is to assist users across key stages of the CQA process.
  • Figure 2: CollabCoder Workflow. The lead coder Alice first splits qualitative data into small units of analysis, e.g., sentence, paragraph, prior to the formal coding. Alice and Bob then: Phase 1: independently perform open coding with GPT assistance; Phase 2: merge, discuss, and make decisions on codes, assisted by GPT; Phase 3: utilize GPT to generate code groups for decided codes and perform editing. They can write reports based on the codebook and the identified themes after the formal coding process.
  • Figure 3: Precoding: establish consistent data units and enlist coding team during project creation. The primary coder, Alice, can: 1) name the project, 2) incorporate data, ensuring it aligns with mutually agreed data units, 2a) illustrate how CollabCoder manages the imported data units, 3) define the coding granularity (e.g., sentence or paragraph), 4) invite a secondary coder, Bob, to the project, and 5) initiate the project.
  • Figure 4: Editing Interface for Phase 1: 1) inputting a customized code for the text in "Raw Data", either by 1a) choosing from GPT's recommendations or 1b) choosing from the top three relevant codes; 2) adding keyword supports by 2a) selecting text from the raw data and clicking "Add As Support"; 3) assigning a certainty level ranging from 1 to 5, where 1="very uncertain" and 5="very certain"; and 4) reviewing and modifying the individual codebook.
  • Figure 5: Comparison Interface for Phase 2. Users can discuss and reach a consensus by following these steps: 1) reviewing the other coder's progress and 1a) clicking the checkbox only once both coders have completed their coding, 2) viewing both coders' codes side by side in the same interface, 3) calculating the similarity between code pairs and 3a) the IRR between coders, 4) sorting the similarity scores from highest to lowest to identify (dis)agreements, and 4a) making a decision through discussion based on the initial codes, raw data, and code supports, or using one of GPT's three suggested code decisions. Additionally, users can "Replace" the original codes proposed by the two coders and revert to the original codes if required. They can also replace or revert all code decisions with a single click on the top bar.
  • ...and 6 more figures
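
The Phase 1 record (code, keyword supports, certainty) and the Phase 2 comparison steps (pairwise similarity, IRR, sorting) described in the captions for Figures 4 and 5 can be sketched roughly as follows. This is an illustrative sketch only, not CollabCoder's implementation: the `CodingDecision` schema and all function names are hypothetical, word-overlap (Jaccard) similarity is used as a simple stand-in for whatever similarity score the system computes, and Cohen's kappa is assumed as the IRR measure because the captions do not name one.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class CodingDecision:
    """One coder's Phase 1 record for a single data unit (hypothetical schema)."""
    unit_id: int
    code: str
    supports: list[str] = field(default_factory=list)  # keywords picked from the raw data
    certainty: int = 3  # 1 = "very uncertain" ... 5 = "very certain"

    def __post_init__(self) -> None:
        if not 1 <= self.certainty <= 5:
            raise ValueError("certainty must be between 1 and 5")


def jaccard_similarity(code_a: str, code_b: str) -> float:
    """Word-overlap similarity in [0, 1]; an assumed stand-in for a semantic score."""
    a, b = set(code_a.lower().split()), set(code_b.lower().split())
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def cohen_kappa(labels_a: list[str], labels_b: list[str]) -> float:
    """Cohen's kappa for two coders who labelled the same units (assumed IRR metric)."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("both coders must label the same non-empty set of units")
    n = len(labels_a)
    observed = sum(x == y for x, y in zip(labels_a, labels_b)) / n
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    # Chance agreement: probability that both coders pick the same label at random.
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both coders used one identical label throughout
    return (observed - expected) / (1 - expected)


def rank_code_pairs(alice: list[CodingDecision], bob: list[CodingDecision]):
    """Pair the two coders' decisions by unit and sort from most to least similar."""
    bob_by_unit = {d.unit_id: d for d in bob}
    pairs = [
        (d.unit_id, d.code, bob_by_unit[d.unit_id].code,
         jaccard_similarity(d.code, bob_by_unit[d.unit_id].code))
        for d in alice if d.unit_id in bob_by_unit
    ]
    return sorted(pairs, key=lambda p: p[3], reverse=True)


if __name__ == "__main__":
    alice = [CodingDecision(1, "price concern", ["expensive"], 4),
             CodingDecision(2, "slow delivery", certainty=5),
             CodingDecision(3, "friendly staff", certainty=2)]
    bob = [CodingDecision(1, "price concern"),
           CodingDecision(2, "delivery delay"),
           CodingDecision(3, "poor packaging")]
    for unit_id, code_a, code_b, score in rank_code_pairs(alice, bob):
        print(unit_id, code_a, "|", code_b, round(score, 2))
    print("kappa:", cohen_kappa([d.code for d in alice], [d.code for d in bob]))
```

Sorting from highest to lowest similarity places agreed codes at the top and the least similar pairs, where discussion is most needed, at the bottom. Exact-match kappa is a strict measure for free-text open codes; it becomes more informative once coders draw on a shared set of codes.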