Human-in-the-loop or AI-in-the-loop? Automate or Collaborate?

Sriraam Natarajan; Saurabh Mathur; Sahil Sidheekh; Wolfgang Stammer; Kristian Kersting

TL;DR

This paper reframes how human-AI collaboration is categorized by distinguishing the human-in-the-loop (HIL) and AI-in-the-loop ($AI^2L$) paradigms. It analyzes differences in control authority, sources of bias, and evaluation criteria, arguing that many systems labeled HIL are in fact $AI^2L$ systems, in which the human is in control and the AI plays a supporting role. The authors propose shifting evaluation away from AI-centered metrics toward user- and population-specific outcomes and advocate nesting of domains to choose appropriately between HIL and $AI^2L$. These insights aim to enable more trustworthy, robust, and context-appropriate human-AI systems across domains.

Abstract

Human-in-the-loop (HIL) systems have emerged as a promising approach for combining the strengths of data-driven machine learning models with the contextual understanding of human experts. However, a deeper look into several of these systems reveals that calling them HIL would be a misnomer, as they are quite the opposite, namely AI-in-the-loop ($AI^2L$) systems, where the human is in control of the system, while the AI is there to support the human. We argue that existing evaluation methods often overemphasize the machine (learning) component's performance, neglecting the human expert's critical role. Consequently, we propose an $AI^2L$ perspective, which recognizes that the human expert is an active participant in the system, significantly influencing its overall performance. By adopting an $AI^2L$ approach, we can develop more comprehensive systems that faithfully model the intricate interplay between the human and machine components, leading to more effective and robust AI systems.
