AI Conversational Interviewing: Transforming Surveys with LLMs as Adaptive Interviewers

Alexander Wuttke; Matthias Aßenmacher; Christopher Klamm; Max M. Lang; Quirin Würschinger; Frauke Kreuter

TL;DR

This paper investigates whether large language models can replace human interviewers in conducting scalable, in-depth conversational interviews. In a small, pre-registered, controlled study with university students, AI interviewing (GPT-4 with a voice-enabled UI) is compared with human interviewing on identical political questions. Results indicate that AI interviewing can achieve data quality comparable to traditional methods while offering scalability, though performance hinges on careful prompting, interface design, and input modality; technical latency remains a challenge. The authors provide a comprehensive evaluation pipeline, publish data and code, and offer practical recommendations for implementation and future research, including cross-model comparisons and context-specific interviewer behavior. Overall, the work demonstrates the viability of AI-driven conversational interviewing as a scalable alternative for qualitative data collection, with clear avenues for improvement.

Abstract

Traditional methods for eliciting people's opinions face a trade-off between depth and scale: structured surveys enable large-scale data collection but limit respondents' ability to voice their opinions in their own words, while conversational interviews provide deeper insights but are resource-intensive. This study explores the potential of replacing human interviewers with large language models (LLMs) to conduct scalable conversational interviews. Our goal is to assess the performance of AI Conversational Interviewing and to identify opportunities for improvement in a controlled environment. We conducted a small-scale, in-depth study with university students who were randomly assigned to a conversational interview by either AI or human interviewers, both employing identical questionnaires on political topics. Various quantitative and qualitative measures assessed interviewer adherence to guidelines, response quality, participant engagement, and overall interview efficacy. The findings indicate the viability of AI Conversational Interviewing in producing quality data comparable to traditional methods, with the added benefit of scalability. We publish our data and materials for re-use and present specific recommendations for effective implementation.

Paper Structure (62 sections, 6 figures, 1 table)

Introduction
Contributions
Related Work
Study Design and Implementation
Procedure
Model setup
User interface
Interview Content
Evaluation Metrics
Interviewer behavior: Human coding.
Interview responses: Human coding.
Interview responses: Computational analysis.
Structured post-interview survey.
Real-time problem recording.
Findings
...and 47 more sections

Figures (6)

Figure 1: Illustration of the concurrent interview settings (human- vs. AI-conducted) and the various metrics (human coding, real-time problem recording, structured post-interview survey, and computational analysis) applied to assess interview quality.
Figure 2: Illustrative example of the chat interface used in the AI in-depth interview (an interaction between an AI agent and a user), showcasing how the interviewer engages in active listening by occasionally restating the preceding answer, as instructed (cf. the AI interviewer prompt in the Appendix). The input field includes options for text input and voice input.
Figure 3: Evaluation of AI (green) vs. human interviewers (orange), showing the scores (y-axis) across different interview assessment criteria (x-axis) for participants' evaluation of the interview (structured post-interview survey).
Figure 4: Evaluation of AI (green) vs. human interviewers (orange), showing the scores (y-axis) across different interview assessment criteria (x-axis) for human-rated response quality (human coding).
Figure 5: Screenshot of the user interface
...and 1 more figure
