RobotPerf: An Open-Source, Vendor-Agnostic, Benchmarking Suite for Evaluating Robotics Computing System Performance

Víctor Mayoral-Vilches; Jason Jabbour; Yu-Shun Hsiao; Zishen Wan; Martiño Crespo-Álvarez; Matthew Stewart; Juan Manuel Reina-Muñoz; Prateek Nagras; Gaurav Vikhe; Mohammad Bakhshalipour; Martin Pinzger; Stefan Rass; Smruti Panigrahi; Giulio Corradi; Niladri Roy; Phillip B. Gibbons; Sabrina M. Neuman; Brian Plancher; Vijay Janapa Reddi

TL;DR

Robotic systems must meet real-time timing and power constraints, yet existing benchmarks largely address either domain-specific tasks or isolated hardware settings. RobotPerf responds with an open-source, vendor-agnostic framework built on ROS 2 that benchmarks complete computational graphs using both grey-box and black-box approaches, across CPUs, GPUs, FPGAs, and accelerators. The methodology emphasizes non-functional metrics, reproducibility, and portability, and supports opaque tests when instrumentation is impractical. Evaluations across 18 hardware platforms and multiple robotic workloads demonstrate RobotPerf's ability to compare platforms, quantify acceleration gains, and guide hardware-software co-design for real-time, energy-efficient robotics.

Abstract

We introduce RobotPerf, a vendor-agnostic benchmarking suite designed to evaluate robotics computing performance across a diverse range of hardware platforms using ROS 2 as its common baseline. The suite encompasses ROS 2 packages covering the full robotics pipeline and integrates two distinct benchmarking approaches: black-box testing, which measures performance by eliminating upper layers and replacing them with a test application, and grey-box testing, an application-specific measure that observes internal system states with minimal interference. Our benchmarking framework provides ready-to-use tools and is easily adaptable for the assessment of custom ROS 2 computational graphs. Drawing from the knowledge of leading robot architects and system architecture experts, RobotPerf establishes a standardized approach to robotics benchmarking. As an open-source initiative, RobotPerf remains committed to evolving with community input to advance the future of hardware-accelerated robotics.
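The difference between the two approaches can be illustrated with a minimal sketch. This is not RobotPerf's implementation and does not use ROS 2: the two-stage pipeline, the stage names `rectify` and `resize`, and all function names below are hypothetical stand-ins chosen for illustration. The black-box function observes only the input and output of the whole graph, the grey-box function places probes around each internal stage, and `worst_case` mirrors the paper's practice of reporting the maximum across a series of runs.

```python
import time
from typing import Callable, Dict, List, Tuple

Frame = List[int]
Stage = Tuple[str, Callable[[Frame], Frame]]


def rectify(frame: Frame) -> Frame:
    # Hypothetical stand-in for an image rectification node.
    return [v + 1 for v in frame]


def resize(frame: Frame) -> Frame:
    # Hypothetical stand-in for an image resize node (keeps every other value).
    return frame[::2]


# Hypothetical computational graph: a linear chain of two stages.
PIPELINE: List[Stage] = [("rectify", rectify), ("resize", resize)]


def black_box_latency_ms(pipeline: List[Stage], frame: Frame) -> float:
    """Time the graph from outside: only its input and output are observed."""
    start = time.perf_counter()
    for _, stage in pipeline:
        frame = stage(frame)
    return (time.perf_counter() - start) * 1e3


def grey_box_latency_ms(pipeline: List[Stage], frame: Frame) -> Dict[str, float]:
    """Probe inside the graph: record one latency per internal stage."""
    per_stage: Dict[str, float] = {}
    for name, stage in pipeline:
        t0 = time.perf_counter()
        frame = stage(frame)
        per_stage[name] = (time.perf_counter() - t0) * 1e3
    return per_stage


def worst_case(samples: List[float]) -> float:
    """Aggregate a series of runs by taking the maximum value."""
    return max(samples)


if __name__ == "__main__":
    frame = list(range(1024))
    runs = [black_box_latency_ms(PIPELINE, frame) for _ in range(10)]
    print(f"black-box worst-case latency: {worst_case(runs):.4f} ms")
    print("grey-box per-stage latency (ms):", grey_box_latency_ms(PIPELINE, frame))
```

In this sketch the grey-box probes add timing calls inside the graph, which is why the abstract stresses minimal interference, while the black-box measurement needs no changes to the system under test.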

Paper Structure (21 sections, 3 figures, 3 tables)

Introduction
Background & Related Work
The Robot Operating System (ROS and ROS 2)
Robotics Benchmarks
RobotPerf: Principles & Methodology
Non-Functional Performance Testing
ROS 2 Integration & Adaptability
Platform Independence & Portability
Flexible Methodology
Grey-Box Testing
Black-Box Testing
Opaque Performance Tests
Reproducibility & Consistency
Metrics
Current Benchmarks and Categories
...and 6 more sections

Figures (3)

Figure 1: A high-level overview of RobotPerf. It targets industry-grade real-time systems with complex and extensible computation graphs using the Robot Operating System (ROS 2) as its common baseline. Emphasizing adaptability, portability, and a community-driven approach, RobotPerf aims to provide fair comparisons of ROS 2 computational graphs across CPUs, GPUs, FPGAs, and other accelerators.
Figure 2: Benchmark comparison of perception latency (ms) on AMD's Kria KR260 with and without the ROBOTCORE Perception accelerator. The benchmarks used are a1, a2, and a5 as defined in Table \ref{tab:benchmarks_table}. We find that hardware acceleration can enable performance gains of as much as 11.5$\times$.
Figure 3: Benchmarking results on diverse hardware platforms across perception, localization, control, and manipulation workloads defined in RobotPerf beta Benchmarks. Radar plots illustrate the latency, throughput, and power consumption for each hardware solution and workload, with reported values representing the maximum across a series of runs. The vertex labels represent the workloads defined in Table \ref{tab:benchmarks_table}. Each hardware platform and performance testing procedure is delineated by a separate color, with darker colors representing black-box testing and lighter colors representing grey-box testing. In the figure's key, the hardware platforms are grouped into four categories: general-purpose hardware, heterogeneous hardware, reconfigurable hardware, and accelerator hardware. Within each category, the platforms are ranked by their Thermal Design Power (TDP), which indicates the maximum power they can draw under load. The throughput values for manipulation tasks and power values for localization tasks have not been incorporated into the beta version of RobotPerf. As RobotPerf continues to evolve, more results will be added in subsequent iterations.
