Automated Survey Collection with LLM-based Conversational Agents

Kurmanbek Kaiyrbekov, Nicholas J Dobbins, Sean D Mooney

TL;DR

An end-to-end survey collection framework driven by conversational Large Language Models that highlights the potential of LLM agents in conducting and analyzing phone surveys for healthcare applications and paves the way for real-world, end-to-end AI-powered phone survey collection systems.

Abstract

Objective: Traditional phone-based surveys are among the most accessible and widely used methods to collect biomedical and healthcare data; however, they are often costly, labor-intensive, and difficult to scale effectively. To overcome these limitations, we propose an end-to-end survey collection framework driven by conversational Large Language Models (LLMs). Materials and Methods: Our framework consists of a researcher responsible for designing the survey and recruiting participants, a conversational phone agent powered by an LLM that calls participants and administers the survey, a second LLM (GPT-4o) that analyzes the conversation transcripts generated during the surveys, and a database for storing and organizing the results. To test our framework, we recruited 8 participants (5 native and 3 non-native English speakers) and administered 40 surveys. We evaluated the correctness of LLM-generated conversation transcripts, the accuracy of survey responses inferred by GPT-4o, and the overall participant experience. Results: GPT-4o extracted survey responses from conversation transcripts with an average accuracy of 98%, even though the transcripts exhibited an average per-line word error rate of 7.7%. Although participants noted occasional errors made by the conversational LLM agent, they reported that the agent effectively conveyed the purpose of the survey, demonstrated good comprehension, and maintained an engaging interaction. Conclusions: Our study highlights the potential of LLM agents in conducting and analyzing phone surveys for healthcare applications. By reducing the workload on human interviewers and offering a scalable solution, this approach paves the way for real-world, end-to-end AI-powered phone survey collection systems.
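
For reference, the per-line word error rate quoted in the Results can be illustrated with the standard definition, WER = (substitutions + deletions + insertions) / (number of reference words), computed for each transcript line and then averaged over lines. The sketch below is a minimal illustration under stated assumptions, not the authors' evaluation code: the function names, the lowercasing and whitespace tokenization, and the example sentences are all hypothetical.

```python
from typing import Iterable, Tuple


def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words.

    Assumption: words are compared after lowercasing and whitespace splitting;
    the normalization actually used in the study is not stated here.
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference line must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix processed
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr.append(
                min(
                    prev[j] + 1,                      # deletion
                    curr[j - 1] + 1,                  # insertion
                    prev[j - 1] + substitution_cost,  # match or substitution
                )
            )
        prev = curr
    return prev[-1] / len(ref)


def average_per_line_wer(pairs: Iterable[Tuple[str, str]]) -> float:
    """Mean of the per-line WER over (reference, hypothesis) line pairs."""
    rates = [word_error_rate(ref, hyp) for ref, hyp in pairs]
    if not rates:
        raise ValueError("at least one line pair is required")
    return sum(rates) / len(rates)
```

With this definition, a five-word reference line containing a single misrecognized word has a WER of 20%, which shows how a few wrong words per line can still leave the meaning of an answer recoverable by a downstream model.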

Paper Structure

This paper contains 22 sections, 1 equation, 6 figures, 4 tables.

Figures (6)

  • Figure 1: Overview of the survey collection and analysis system. Our framework consists of a researcher who prepares a survey and writes the necessary prompts for the large language models, an AI-based conversational phone agent that calls participants to conduct the survey, a survey participant, a large language model that analyzes conversations to deduce answers to individual survey questions, and a database for storing results.
  • Figure 2: Representation of fictitious persona generation and survey response based on the persona. It consists of 3 main steps: 1) Fictitious survey response generation, in which the answer to each survey question is probabilistically selected from the possible options. 2) Fictitious persona generation using GPT-4o based on the fictitious survey response. 3) Participant response to the survey based on the fictitious persona.
  • Figure 3: Average accuracy for each survey question. To compute the average accuracy, the accuracy for each question was calculated for each participant as the percentage of correct responses across five personas. The final averages were then obtained by aggregating the accuracies across all participants.
  • Figure 4: Results from the post-survey questionnaire, illustrating participants' agreement with various statements. Each statement is listed on the right, accompanied by horizontal bars that represent the proportion of participants selecting each response option. Bar colors correspond to specific response categories, as indicated in the legend.
  • Figure S1: Assignment and role description for the GPT-4o agent that analyzes conversation transcripts.
  • ...and 1 more figures
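
The first step in the Figure 2 caption (probabilistic selection of answers) and the two-stage averaging in the Figure 3 caption can be sketched as follows. This is a rough illustration under stated assumptions rather than the authors' code: the function names and record layout are hypothetical, uniform sampling stands in for whatever selection probabilities the study used, and "aggregating" across participants is assumed to mean taking the arithmetic mean.

```python
import random
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def sample_fictitious_response(
    options_by_question: Dict[str, List[str]], rng: random.Random
) -> Dict[str, str]:
    """Pick one answer per question at random (Figure 2, step 1).

    Assumption: each option is equally likely; the caption only says the
    selection is probabilistic.
    """
    return {q: rng.choice(opts) for q, opts in options_by_question.items()}


def average_accuracy_per_question(
    records: Iterable[Tuple[str, str, bool]]
) -> Dict[str, float]:
    """Two-stage averaging described in the Figure 3 caption.

    Each record is (participant_id, question_id, is_correct) for one persona.
    Stage 1: per participant and question, percentage correct across personas.
    Stage 2: per question, mean of those percentages across participants.
    """
    outcomes = defaultdict(list)
    for participant, question, is_correct in records:
        outcomes[(participant, question)].append(1.0 if is_correct else 0.0)

    per_question = defaultdict(list)
    for (_participant, question), values in outcomes.items():
        per_question[question].append(100.0 * sum(values) / len(values))

    return {q: sum(v) / len(v) for q, v in per_question.items()}
```

For example, if one participant's answers to a question are extracted correctly for 4 of 5 personas (80%) and another's for 5 of 5 (100%), the reported average accuracy for that question under these assumptions would be 90%.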