Free Lunch for User Experience: Crowdsourcing Agents for Scalable User Studies

Siyang Liu, Sahand Sabour, Xiaoyang Wang, Rada Mihalcea

TL;DR

This work tackles the recruitment cost and diversity limits of UX studies by introducing Crowdsourcing Simulated User Agents (CSUA), a scalable framework that recruits AI agents from billion-scale profile assets to act as study participants. The authors implement a four-stage pipeline—onboarding, screening, experiencing, and feedback—along with an open-source toolkit, and validate the approach with a game prototyping study that yields 240 simulated players from 2,900 profiles. They demonstrate a clear scaling effect: aggregated simulations approach the breadth of human findings, plateauing around $0.90$ coverage, with $12.8$ simulated agents equating to one local human and $3.2$ equating to one crowdsourced human. Expert designers rate the agent-based outputs as a favorable balance of fidelity, cost, time efficiency, and usefulness, supporting the view that simulated participants are a valuable complementary tool for rapid, scalable UX prototyping. Overall, CSUA provides a practical, reusable pathway to scalable UX studies that expand participant diversity and accelerate iteration without fully replacing human studies.

Abstract

User studies are central to user experience research, yet recruiting participants is expensive, slow, and limited in diversity. Recent work has explored using Large Language Models as simulated users, but doubts about fidelity have hindered practical adoption. We deepen this line of research by asking whether scale itself can enable useful simulation, even if not perfectly accurate. We introduce Crowdsourcing Simulated User Agents, a method that recruits generative agents from billion-scale profile assets to act as study participants. Unlike handcrafted simulations, agents are treated as recruitable, screenable, and engageable across UX research stages. To ground this method, we demonstrate a game prototyping study with hundreds of simulated players, comparing their insights against a 10-participant local user study and a 20-participant crowdsourcing study with humans. We find a clear scaling effect: as the number of simulated user agents increases, coverage of human findings rises smoothly and plateaus around 90%. In terms of usefulness, 12.8 simulated agents are equivalent to one locally recruited human, and 3.2 agents are equivalent to one crowdsourced human. Results show that while individual agents are imperfect, aggregated simulations produce representative and actionable insights comparable to real users. Professional designers further rated these insights as balancing fidelity, cost, time efficiency, and usefulness. Finally, we release an agent crowdsourcing toolkit with a modular open-source pipeline and a curated pool of profiles synced from ongoing simulation research, to lower the barrier for researchers to adopt simulated participants. Together, this work contributes a validated method and reusable toolkit that expand the options for conducting scalable and practical UX studies.
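
The scaling effect described above can be illustrated with a minimal sketch. It assumes, hypothetically, that each participant's feedback has been reduced to a set of qualitative codes and that coverage is the fraction of human-derived codes that appear in the pooled codes of a sample of agents. The function names, the toy codes, and this definition of coverage are illustrative assumptions, not the paper's exact metric or toolkit API.

```python
import random


def coverage(agent_codes, human_codes):
    """Fraction of human codes found in at least one agent's code set.

    Assumed definition for illustration; the paper's metric may differ.
    """
    if not human_codes:
        raise ValueError("human_codes must be non-empty")
    pooled = set().union(*agent_codes)  # union of all sampled agents' codes
    return len(pooled & human_codes) / len(human_codes)


def coverage_curve(agent_codes, human_codes, sizes, trials=200, seed=0):
    """Mean coverage over random agent subsets of each requested size."""
    rng = random.Random(seed)
    curve = {}
    for n in sizes:
        samples = [
            coverage(rng.sample(agent_codes, n), human_codes)
            for _ in range(trials)
        ]
        curve[n] = sum(samples) / trials
    return curve


# Toy data (hypothetical codes, not taken from the study).
human_codes = {"a", "b", "c", "d"}
agents = [{"a"}, {"b"}, {"a", "c"}, {"x"}]

full = coverage(agents, human_codes)
curve = coverage_curve(agents, human_codes, sizes=[1, 2, 4])
print(full, curve)
```

In this toy setup no single agent covers more than half of the human codes, while the pooled set covers three of the four. This mirrors the claim that individual agents are imperfect but aggregated simulations approach the breadth of human findings.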

Paper Structure

This paper contains 47 sections, 3 equations, 10 figures, 7 tables.

Figures (10)

  • Figure 1: Crowdsourcing simulated user agents framework. Each stage requires inputs and produces outputs, with support from the toolkit.
  • Figure 2: Comparison of figures before and after calibration and screening, shown in a single row.
  • Figure 3: Example interaction with NPC Kass. Left: player agent prompt; Right: dialogue with think-aloud and actions.
  • Figure 3: Evaluation scores from three game experts across four user studies.
  • Figure 4: Overlap of qualitative codes across three user studies: 10 local human participants (yellow), 20 crowdsourced human participants (red), and 240 simulated agent participants (blue).
  • ...and 5 more figures