LLM-Assisted Content Analysis: Using Large Language Models to Support Deductive Coding

Robert Chew, John Bollenbacher, Michael Wenger, Jessica Speer, Annice Kim

TL;DR

Deductive coding is powerful but time-intensive, limiting scale and generalization. The authors propose LACA, a holistic framework that integrates large language models with traditional content analysis via codebook co-development, reliability testing, and final dataset coding to preserve theoretical constructs while reducing manual workload. Through a case study on Trump Tweets and a four-dataset benchmark using GPT-3.5-turbo, LACA demonstrates frequent human-level agreement and substantial coding-time savings, while providing mechanisms to refine codebooks and flag underperforming categories. The work offers practical guidance for implementing LACA, highlights its limitations, and outlines future directions in prompting strategies, transparency, and broader applicability to qualitative coding tasks.

Abstract

Deductive coding is a widely used qualitative research method for determining the prevalence of themes across documents. While useful, deductive coding is often burdensome and time-consuming, since it requires researchers to read, interpret, and reliably categorize a large body of unstructured text documents. Large language models (LLMs), such as ChatGPT, are a class of quickly evolving AI tools that can perform a range of natural language processing and reasoning tasks. In this study, we explore the use of LLMs to reduce the time it takes for deductive coding while retaining the flexibility of a traditional content analysis. We outline the proposed approach, called LLM-assisted content analysis (LACA), along with an in-depth case study using GPT-3.5 for LACA on a publicly available deductive coding data set. Additionally, we conduct an empirical benchmark using LACA on four publicly available data sets to assess the broader question of how well GPT-3.5 performs across a range of deductive coding tasks. Overall, we find that GPT-3.5 can often perform deductive coding at levels of agreement comparable to human coders. We also demonstrate that LACA can help refine prompts for deductive coding, identify codes for which an LLM is randomly guessing, and help assess when to use LLMs vs. human coders for deductive coding. We conclude with several implications for the future practice of deductive coding and related research methods.

Paper Structure (29 sections, 4 equations, 5 figures, 10 tables)


Figures (5)

  • Figure 1: LLM-Assisted Content Analysis (LACA) Process Diagram
  • Figure 2: Trump Tweets Prompt
  • Figure 3: BBC News Prompt
  • Figure 4: Ukraine Water Problems Prompt
  • Figure 5: Contrarian Claims Prompt