User Simulation for Evaluating Information Access Systems
Krisztian Balog, ChengXiang Zhai
TL;DR
The paper surveys the field of user simulation for evaluating information access systems, arguing that traditional offline, online, and user-study methodologies fail to capture the full interactive utility experienced by real users. It presents a general, MDP-based framework to model user-system interactions across information retrieval, recommender systems, and conversational assistants, enabling reproducible, end-to-end evaluation and what-if analyses. Key contributions include a comprehensive taxonomy of simulation approaches (model-based vs. data-driven), a unified evaluation framework for sessions and tasks, and practical guidance on instantiation, validation, and tooling, while identifying open challenges in realism, interpretability, and multi-task integration. The work underscores the potential for simulation to bridge academia and industry, support data-efficient evaluation, and advance interdisciplinary research toward more realistic, adaptable interactive AI systems. Overall, the paper argues that robust, interpretable, and extensible user simulators are essential for evaluating complex, interactive information access systems and for guiding future developments in AI-enabled information services.
Abstract
Information access systems, such as search engines, recommender systems, and conversational assistants, have become integral to our daily lives as they help us satisfy our information needs. However, evaluating the effectiveness of these systems presents a long-standing and complex scientific challenge. This challenge is rooted in the difficulty of assessing a system's overall effectiveness in assisting users to complete tasks through interactive support, and is further exacerbated by the substantial variation in user behaviour and preferences. To address this challenge, user simulation emerges as a promising solution. This book focuses on providing a thorough understanding of user simulation techniques designed specifically for evaluation purposes. We begin with background on information access system evaluation and explore the diverse applications of user simulation. Subsequently, we systematically review the major research progress in user simulation, covering general frameworks for designing user simulators and for utilizing user simulation for evaluation, as well as specific models and algorithms for simulating user interactions with search engines, recommender systems, and conversational assistants. Recognizing that user simulation is an interdisciplinary research topic, we attempt, whenever possible, to establish connections with related fields, including machine learning, dialogue systems, user modeling, and economics. We end the book with a detailed discussion of important future research directions, many of which extend beyond the evaluation of information access systems and are expected to have a broader impact on how to evaluate interactive intelligent systems in general.
