A Crowd-based Evaluation of Abuse Response Strategies in Conversational Agents
Amanda Cercas Curry, Verena Rieser
TL;DR
This study addresses how conversational agents should respond to verbal abuse by evaluating state-of-the-art strategies in a large crowd-sourced trial. It builds a corpus of abusive prompts, gathers thousands of system replies across diverse deployments, has experts annotate them, and then crowdsources judgments of their perceived appropriateness. The key findings are that polite refusal is the most favorably rated strategy, that judgments are modulated by user age and the preceding abuse context, and that data-driven models generally lag behind rule-based and commercial systems. The work advocates context- and user-adaptive mitigation and calls for testing with real users to refine abuse-response policies in practice.
Abstract
How should conversational agents respond to verbal abuse from the user? To answer this question, we conduct a large-scale crowd-sourced evaluation of abuse response strategies employed by current state-of-the-art systems. Our results show that some strategies, such as "polite refusal", score highly across the board, while for other strategies demographic factors, such as age, as well as the severity of the preceding abuse, influence the user's perception of which response is appropriate. In addition, we find that most data-driven models lag behind rule-based or commercial systems in terms of their perceived appropriateness.
