Towards interactive evaluations for interaction harms in human-AI systems

Lujain Ibrahim, Saffron Huang, Umang Bhatt, Lama Ahmad, Markus Anderljung

TL;DR

Static, model-centered evaluations fail to capture harms that emerge during sustained human–AI interaction. The paper introduces interactional ethics and the concept of interaction harms to guide interactive evaluations, outlining three organizing principles: ecologically valid scenario design, rigorous human-impact metrics, and diverse participation strategies. It discusses implementation challenges, including ethics, data access, infrastructure, and the translation of findings into stakeholder decisions, aiming to bridge research with governance and practice. This framework seeks to enable more accurate assessment of complex human–AI dynamics, ultimately informing safer deployment and governance of interactive AI systems.

Abstract

Current AI evaluation methods, which rely on static, model-only tests, fail to account for harms that emerge through sustained human-AI interaction. As AI systems proliferate and are increasingly integrated into real-world applications, this disconnect between evaluation approaches and actual usage becomes more significant. In this paper, we propose a shift towards evaluation based on interactional ethics, which focuses on interaction harms: issues like inappropriate parasocial relationships, social manipulation, and cognitive overreliance that develop over time through repeated interaction rather than through isolated outputs. First, we discuss the limitations of current evaluation methods, which (1) are static, (2) assume a universal user experience, and (3) have limited construct validity. Drawing on research from human-computer interaction, natural language processing, and the social sciences, we present practical principles for designing interactive evaluations. These include ecologically valid interaction scenarios, human impact metrics, and diverse human participation approaches. Finally, we explore implementation challenges and open research questions for researchers, practitioners, and regulators aiming to integrate interactive evaluations into AI governance frameworks. This work lays the groundwork for developing more effective evaluation methods that better capture the complex dynamics between humans and AI systems.

Paper Structure (16 sections, 2 figures, 2 tables)