Principles and Guidelines for Evaluating Social Robot Navigation Algorithms

Anthony Francis; Claudia Pérez-D'Arpino; Chengshu Li; Fei Xia; Alexandre Alahi; Rachid Alami; Aniket Bera; Abhijat Biswas; Joydeep Biswas; Rohan Chandra; Hao-Tien Lewis Chiang; Michael Everett; Sehoon Ha; Justin Hart; Jonathan P. How; Haresh Karnan; Tsang-Wei Edward Lee; Luis J. Manso; Reuth Mirsky; Sören Pirk; Phani Teja Singamaneni; Peter Stone; Ada V. Taylor; Peter Trautman; Nathan Tsoi; Marynel Vázquez; Xuesu Xiao; Peng Xu; Naoki Yokoyama; Alexander Toshev; Roberto Martín-Martín

TL;DR

The paper addresses the difficulty of evaluating social robot navigation fairly by proposing a benchmarking framework grounded in eight principles: safety, comfort, legibility, politeness, social competency, agent understanding, proactivity, and contextual appropriateness. It defines social robot navigation, develops a taxonomy covering metrics, scenarios, benchmarks, datasets, and simulators, and advocates a common API so that evaluations can be compared across platforms. The work takes a lifecycle view of the research, from data collection and issue discovery to benchmark challenges, and gives concrete guidelines for real-world studies, scenario design, benchmark construction, and simulator interoperability. The unified evaluation framework and metrics API are intended to enable fair comparisons, reveal limitations of existing methods, and speed the deployment of socially aware robots in human environments.

Abstract

A major challenge to deploying robots widely is navigation in human-populated environments, commonly referred to as social robot navigation. While the field of social navigation has advanced tremendously in recent years, the fair evaluation of algorithms that tackle social navigation remains hard because it involves not just robotic agents moving in static environments but also dynamic human agents and their perceptions of the appropriateness of robot behavior. In contrast, clear, repeatable, and accessible benchmarks have accelerated progress in fields like computer vision, natural language processing and traditional robot navigation by enabling researchers to fairly compare algorithms, revealing limitations of existing solutions and illuminating promising new directions. We believe the same approach can benefit social navigation. In this paper, we pave the road towards common, widely accessible, and repeatable benchmarking criteria to evaluate social robot navigation. Our contributions include (a) a definition of a socially navigating robot as one that respects the principles of safety, comfort, legibility, politeness, social competency, agent understanding, proactivity, and responsiveness to context, (b) guidelines for the use of metrics, development of scenarios, benchmarks, datasets, and simulators to evaluate social navigation, and (c) a design of a social navigation metrics framework to make it easier to compare results from different simulators, robots and datasets.
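
To make contribution (c) concrete, the following is a minimal sketch of what a simulator-agnostic metrics interface could look like. It is not the framework proposed in the paper: the `Episode` record, the three metric functions, and the 0.3 m goal tolerance are all hypothetical names and values chosen for illustration. The idea it shows is that once any simulator, robot, or dataset exports trajectories into one shared record, the same metric code can score all of them.

```python
from dataclasses import dataclass
from math import hypot
from typing import Dict, List, Tuple

Point = Tuple[float, float]


@dataclass
class Episode:
    """Hypothetical shared log format: one entry per time step."""
    robot_xy: List[Point]         # robot position at each step (non-empty)
    humans_xy: List[List[Point]]  # humans_xy[t] holds every human position at step t
    goal_xy: Point                # robot's navigation goal


def min_human_distance(ep: Episode) -> float:
    """Smallest robot-human distance over the episode (a safety proxy)."""
    distances = [
        hypot(rx - hx, ry - hy)
        for (rx, ry), humans in zip(ep.robot_xy, ep.humans_xy)
        for hx, hy in humans
    ]
    return min(distances) if distances else float("inf")


def path_length(ep: Episode) -> float:
    """Total distance travelled by the robot."""
    steps = zip(ep.robot_xy, ep.robot_xy[1:])
    return sum(hypot(x1 - x0, y1 - y0) for (x0, y0), (x1, y1) in steps)


def reached_goal(ep: Episode, tolerance: float = 0.3) -> bool:
    """True if the final robot position is within `tolerance` of the goal."""
    x, y = ep.robot_xy[-1]
    return hypot(x - ep.goal_xy[0], y - ep.goal_xy[1]) <= tolerance


def evaluate(ep: Episode) -> Dict[str, float]:
    """Compute every metric from the shared record, whatever its source."""
    return {
        "min_human_distance": min_human_distance(ep),
        "path_length": path_length(ep),
        "success": 1.0 if reached_goal(ep) else 0.0,
    }
```

Under these assumptions, a source plugs in by converting its own logs into `Episode` objects and calling `evaluate`. Geometric quantities like these cover only part of what the abstract describes, since human perceptions of appropriateness would need survey-style measures that this sketch does not attempt.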

Paper Structure

This paper contains 90 sections, 11 figures, and 6 tables.

Introduction
Related Work
Towards a Definition of Social Navigation
What is a Social Robot?
Principles of Social Navigation
Research Methodologies of Social Navigation
Research Questions of Social Navigation
Types of Social Navigation Studies
Lifecycle of Social Navigation Research
Guidelines for Real-world Studies
A Taxonomy of Social Navigation
A Taxonomy for Analysis
Factors Common to Social Navigation
Social Navigation Metrics
Taxonomy of Existing Social Navigation Metrics
...and 75 more sections

Figures (11)

Figure 1: We identify eight broad principles of social robot navigation (safety, comfort, legibility, politeness, social competency, agent understanding, proactivity, and contextual appropriateness), which motivate specific guidelines for experiments, metrics, scenarios, benchmarks, datasets, and simulators. Principles and guidelines are labeled with two-letter codes, with P for principles, R for real-world issues, M for metrics, N for scenarios, B for benchmarks, D for datasets, and S for simulators.
Figure 2: We define a socially navigating robot as one that interacts with humans and other robots in a way that achieves its navigation goals while enabling other agents to achieve theirs. To make this objective achievable, we propose eight principles for social robot navigation: safety, comfort, legibility, politeness, social competency, agent understanding, proactivity, and contextual appropriateness.
Figure 3: Contextual factors of social navigation. While the first seven principles represent factors to optimize, the eighth principle, contextual appropriateness, calls out that the weighting of these factors can be affected by many features, including cultural, diversity, environmental, task and interpersonal context. Lines in the diagram are representative of common interactions but are not exclusive.
Figure 4: Lifecycle of social navigation research. Field studies, robot deployments, and staged social interactions can be used to collect data, which helps identify issues and their prevalence. Issues discovered guide laboratory experiments and the development of social navigation scenarios, which in turn can inform data collection. Issue discovery also helps guide the development of benchmarks that test these issues, along with public benchmark challenges; attempts at solutions of these challenges can also help identify issues.
Figure 5: A taxonomy of social navigation. Most social navigation instruments share common factors like overall context, physical environments, human user type, robot role and task, and so on. However, datasets, benchmarks and simulators have additional factors particular to them.
...and 6 more figures
