Using ChatGPT for Thematic Analysis
Aleksei Turobov, Diane Coyle, Verity Harding
TL;DR
The paper investigates the feasibility and rigor of using a custom GPT-based tool to perform initial coding in qualitative thematic analysis, using UN policy documents as a test bed. It demonstrates that GPT can generate hundreds of codes and reveal nuanced thematic shifts, validated in part by Latent Dirichlet Allocation topic modeling, while also highlighting limitations such as descriptive bias, occasional errors, and the need for manual verification. The authors discuss policy changes restricting direct quotations, which affect data-rich qualitative workflows, and advocate a hybrid AI–human approach with careful prompt engineering and verification. Collectively, the work provides a practical, transparent framework for integrating AI into thematic analysis and outlines methodological considerations and safeguards for future research. The significance lies in offering a replicable protocol that can scale qualitative analysis while maintaining rigor and ethical standards in social science research.
Abstract
The use of AI-driven tools, notably ChatGPT, in academic research is increasingly debated from several perspectives: ease of implementation and potential gains in research efficiency are weighed against ethical concerns and risks such as biases and unexplained AI operations. This paper explores the use of the GPT model for initial coding in qualitative thematic analysis, using a sample of UN policy documents. The primary aim of this study is to contribute to the methodological discussion regarding the integration of AI tools, offering a practical guide to validating the use of GPT as a collaborative research assistant. The paper outlines the advantages and limitations of this methodology and suggests strategies to mitigate risks. Emphasising the importance of transparency and reliability in employing GPT within research methodologies, this paper argues for a balanced use of AI-supported thematic analysis, highlighting its potential to elevate research efficacy and outcomes.
