AI Apology: A Critical Review of Apology in AI Systems

Hadassah Harland, Richard Dazeley, Hashini Senaratne, Peter Vamplew, Francisco Cruz, Bahareh Nakisa

TL;DR

This paper presents the first synthesis and critical analysis of AI apology research from 2020 to 2023, introducing a Framework of AI apology built on five elements (interaction, offence, recipient, offender, outcomes) and 12 components with six moderators. It synthesises empirical, theoretical, and technical works to show how apologies can realign misaligned human-AI interactions through affective, regulatory, and informative effects, while highlighting a persistent capability gap in autonomous apology (detection, attribution, explanation, adaptation). The review identifies context-dependent outcomes, mixed evidence for many components, and a need for longitudinal, cross-disciplinary study designs that consider embodiment, anthropomorphism, and user characteristics. It concludes with concrete recommendations to advance the field toward robust, human-aligned apologetic AI systems, including improved measurement practices, simulation environments, and integrated technical capabilities for autonomous apology.

Abstract

Apologies are a powerful tool used in human-human interactions to provide affective support, regulate social processes, and exchange information following a trust violation. The emerging field of AI apology investigates the use of apologies by artificially intelligent systems, with recent research suggesting how this tool may provide similar value in human-machine interactions. Until recently, contributions to this area were sparse, and these works have yet to be synthesised into a cohesive body of knowledge. This article provides the first synthesis and critical analysis of the state of AI apology research, focusing on studies published between 2020 and 2023. We derive a framework of attributes to describe five core elements of apology: outcome, interaction, offence, recipient, and offender. With this framework as the basis for our critique, we show how apologies can be used to recover from misalignment in human-AI interactions, and examine trends and inconsistencies within the field. Among the observations, we outline the importance of curating a human-aligned and cross-disciplinary perspective in this research, with consideration for improved system capabilities and long-term outcomes.

Paper Structure

This paper contains 74 sections, 8 figures, and 8 tables.

Figures (8)

  • Figure 1: An apology is described by five elements: the interaction between the offender and the recipient regarding an offence towards an outcome. Arrows describe how these elements influence each other
  • Figure 2: Attributes of an apology consist of prescriptive components and descriptive moderators
  • Figure 3: Capabilities used in an apology include (1) attribute responsibility, (2) explain what has happened, and (3) adapt future behaviour. Initiation of an apology additionally may require the capability to (0) detect the offence, although this may be circumvented if the offender is informed
  • Figure 4: The Framework of AI apology transposes the five elements of apology into the human-AI case: an interaction between a system (offender) and its user (recipient) regarding some task and context (offence), resulting in an outcome(s). Each of the components, moderators and capabilities described are represented in this framework according to their associated element, with arrows describing how these influence each other
  • Figure 5: The steps applied in the retrieval, screening and eligibility assessment of the literature for this review
  • ...and 3 more figures