Sim911: Towards Effective and Equitable 9-1-1 Dispatcher Training with an LLM-Enabled Simulation
Zirong Chen, Elizabeth Chason, Noah Mladenovski, Erin Wilson, Kristin Mullen, Stephen Martini, Meiyi Ma
TL;DR
Sim911 addresses the labor-intensive and inequitable nature of traditional 9-1-1 dispatcher training with an LLM-powered simulation platform that grounds conversations in local context. The system combines knowledge construction (KC), context-aware controlled generation (CaCG), and validation with looped correction (VLC) to produce realistic, diverse, and equitable call simulations covering 57 incident types and 14 caller profiles. Real-world deployment with DEC demonstrates strong realism and authenticity, with substantial time savings (26.55 hours) and high trainee endorsement (~90%), while component ablations confirm that KC, CaCG, and VLC are each necessary for maintaining quality. This work offers a scalable framework for augmenting training in centers with limited staffing and lays the groundwork for extending AI-driven dialogue simulations to other high-stakes domains.
Abstract
Emergency response services are vital for enhancing public safety by safeguarding the environment, property, and human lives. As frontline members of these services, 9-1-1 dispatchers have a direct impact on response times and the overall effectiveness of emergency operations. However, traditional dispatcher training methods, which rely on role-playing by experienced personnel, are labor-intensive, time-consuming, and often neglect the specific needs of underserved communities. To address these challenges, we introduce Sim911, the first training simulation for 9-1-1 dispatchers powered by Large Language Models (LLMs). Sim911 enhances training through three key technical innovations: (1) knowledge construction, which utilizes archived 9-1-1 call data to generate simulations that closely mirror real-world scenarios; (2) context-aware controlled generation, which employs dynamic prompts and vector bases to ensure that LLM behavior aligns with training objectives; and (3) validation with looped correction, which filters out low-quality responses and refines system performance.
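
The way the three components might fit together for a single simulated caller turn can be sketched as follows. This is a minimal illustration under stated assumptions, not the Sim911 implementation: every name here (`Scenario`, `retrieve`, `build_prompt`, `validate`, `simulate_turn`) is hypothetical, the bag-of-words cosine retrieval stands in for whatever vector base the system actually uses, the validation rules are placeholders, and `generate` is any caller-supplied function that maps a prompt to text.

```python
from collections import Counter
from dataclasses import dataclass
from math import sqrt
from typing import Callable, List


@dataclass
class Scenario:
    # Hypothetical fields; the abstract only mentions incident types and caller profiles.
    incident_type: str
    caller_profile: str


def _vec(text: str) -> Counter:
    """Bag-of-words vector (a toy stand-in for a learned embedding)."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = sqrt(sum(v * v for v in a.values()))
    nb = sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def retrieve(archive: List[str], query: str, k: int = 1) -> List[str]:
    """Knowledge step: return the k archived snippets most similar to the query."""
    q = _vec(query)
    return sorted(archive, key=lambda doc: cosine(_vec(doc), q), reverse=True)[:k]


def build_prompt(s: Scenario, context: List[str], dispatcher_turn: str, feedback: str = "") -> str:
    """Controlled-generation step: assemble a prompt from scenario, context, and feedback."""
    lines = [
        f"You are a 9-1-1 caller. Incident: {s.incident_type}. Profile: {s.caller_profile}.",
        "Reference material: " + " | ".join(context),
        f"Dispatcher: {dispatcher_turn}",
    ]
    if feedback:
        lines.append(f"Previous attempt rejected: {feedback}")
    return "\n".join(lines)


def validate(reply: str) -> str:
    """Validation step: return '' if acceptable, else feedback (placeholder rules)."""
    if not reply.strip():
        return "empty reply"
    if "as an ai" in reply.lower():
        return "broke character: stay in the caller role"
    return ""


def simulate_turn(generate: Callable[[str], str], archive: List[str], s: Scenario,
                  dispatcher_turn: str, max_retries: int = 2) -> str:
    """Generate a caller reply, looping with corrective feedback until it validates."""
    context = retrieve(archive, f"{s.incident_type} {dispatcher_turn}")
    feedback = ""
    for _ in range(max_retries + 1):
        reply = generate(build_prompt(s, context, dispatcher_turn, feedback))
        feedback = validate(reply)
        if not feedback:
            return reply
    return "[no valid reply produced]"  # fallback after exhausting retries
```

In this sketch, a rejected reply is never shown to the trainee. The validator's feedback is instead folded into the next prompt, which is one simple way to realize a looped correction.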
