Humanlike Multi-user Agent (HUMA): Designing a Deceptively Human AI Facilitator for Group Chats
Mateusz Jacniacki, Martí Carmona Serrat
TL;DR
This work tackles the challenge of enabling AI to act as a natural facilitator in asynchronous multi-person group chats. It introduces HUMA, an LLM-based facilitator built on an event-driven architecture with three components (Router, Action Agent, Reflection) to manage when to speak, who to address, and how to handle interruptions, including realistic typing delays. The method extends the MUCA 3W framework with 20 strategies, timing regularization, and tool-use constraints to support diverse, context-sensitive participation. In a controlled study with 97 participants across four-person role-play chats, HUMA was nearly indistinguishable from human community managers and yielded comparable subjective experience measures, suggesting practical viability for scalable, trustworthy group chat facilitation.
Abstract
Conversational agents built on large language models (LLMs) are becoming increasingly prevalent, yet most systems are designed for one-on-one, turn-based exchanges rather than natural, asynchronous group chats. As AI assistants become widespread throughout digital platforms, from virtual assistants to customer service, developing natural and humanlike interaction patterns seems crucial for maintaining user trust and engagement. We present the Humanlike Multi-user Agent (HUMA), an LLM-based facilitator that participates in multi-party conversations using humanlike strategies and timing. HUMA extends prior multi-user chatbot work with an event-driven architecture that handles messages, replies, and reactions, and it introduces realistic response-time simulation. HUMA comprises three components (Router, Action Agent, and Reflection), which together adapt LLMs to group conversation dynamics. We evaluate HUMA in a controlled study with 97 participants in four-person role-play chats, comparing AI and human community managers (CMs). Participants classified CMs as human at near-chance rates in both conditions, indicating they could not reliably distinguish HUMA agents from humans. Subjective experience was comparable across conditions: community-manager effectiveness, social presence, and engagement/satisfaction differed only modestly, with small effect sizes. Our results suggest that, in natural group chat settings, an AI facilitator can match human quality while remaining difficult to identify as nonhuman.
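To make the event-driven idea concrete, the following is a minimal illustrative sketch, not the authors' implementation. It shows how an incoming chat event might be routed to a decision and how a response delay could be derived from message length. All names (`Event`, `route`, `typing_delay`), the routing rules, and the typing-speed parameters are assumptions introduced here for illustration; the abstract does not specify them, and in HUMA these decisions would presumably be made by LLM-based components rather than fixed rules.

```python
from dataclasses import dataclass


@dataclass
class Event:
    """A chat event. The three kinds mirror those named in the abstract."""
    kind: str          # "message", "reply", or "reaction"
    sender: str
    text: str = ""


def typing_delay(text: str, chars_per_second: float = 5.0,
                 min_seconds: float = 1.0) -> float:
    """Hypothetical delay model: time grows linearly with message length.

    The speed and floor values are placeholders, not figures from the paper.
    """
    return max(min_seconds, len(text) / chars_per_second)


def route(event: Event, agent_name: str) -> str:
    """Hypothetical rule-based router deciding how to treat an event.

    Returns "ignore", "respond", or "wait".
    """
    if event.sender == agent_name:
        return "ignore"            # never react to the agent's own events
    if event.kind == "reaction":
        return "ignore"            # reactions alone do not trigger a reply here
    if agent_name.lower() in event.text.lower():
        return "respond"           # the agent was addressed by name
    return "wait"                  # let the conversation continue


if __name__ == "__main__":
    incoming = Event(kind="message", sender="alice", text="hey HUMA, thoughts?")
    if route(incoming, "HUMA") == "respond":
        draft = "Good question, what does everyone else think?"
        print(f"would send after {typing_delay(draft):.1f}s: {draft}")
```

Keeping the routing decision separate from the delay computation mirrors the separation of concerns the abstract describes, although the real components and their interfaces may differ substantially.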
