ROS 2 on a Chip, Achieving Brain-Like Speeds and Efficiency in Robotic Networking

Víctor Mayoral-Vilches; Juan Manuel Reina-Muñoz; Martiño Crespo-Álvarez; David Mayoral-Vilches

TL;DR

The paper addresses latency and energy inefficiency in ROS 2 networking for real-time robotics by presenting ROS 2 on a Chip, a hardware-accelerated implementation prototyped on FPGAs. The ROBOTCORE architecture maps ROS 2 abstractions, RTPS, and UDP/IP into hardware with DDS interoperability, creating deterministic datapaths that bypass software overhead. Key results show mean latencies near $2\,\mu s$ and maximum latencies as low as $11\,\mu s$, with improvements of more than $30{,}000\times$ in peak latency and more than $500\times$ in energy efficiency over CPU-based ROS 2. These findings demonstrate brain-like, deterministic robotic networking and motivate broader adoption of hardware-accelerated ROS in both industrial and teleoperation contexts. The paper also outlines future work on broader benchmarking, scalability, and security.

Abstract

The Robot Operating System (ROS) publish-subscribe (pubsub) model has played a pivotal role in developing sophisticated robotic applications. However, the complexities and real-time demands of modern robotics necessitate more efficient communication solutions that are deterministic and isochronous. This article introduces a groundbreaking approach: embedding the ROS 2 message-passing infrastructure directly onto a specialized hardware chip, significantly enhancing speed and efficiency in robotic communications. Our FPGA prototypes of the chip design can send or receive packets in less than 2.5 microseconds, accelerating networking communications by more than 62x on average and improving energy consumption by more than 500x when compared to traditional ROS 2 software implementations on modern CPUs. The design also reduces maximum latency in ROS 2 networking communication by more than 30,000x. In situations of peak latency, our design guarantees an isochronous response within 11 microseconds, a stark improvement over the potential hundreds of milliseconds reported by modern CPU systems under similar conditions.
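As a rough sanity check on how the abstract's figures relate to one another, the sketch below multiplies the quoted hardware latencies by the quoted improvement factors to see what CPU-side latencies they would imply. This is only back-of-the-envelope arithmetic: the abstract states bounds ("less than", "more than") rather than exact values, the 2.5 microsecond figure refers to a send or receive operation, and the constant and function names are illustrative assumptions, not taken from the paper.

```python
# Figures quoted in the abstract, treated here as point values (an assumption;
# the abstract gives them as bounds).
HW_MEAN_US = 2.5        # "less than 2.5 microseconds" per send or receive
HW_MAX_US = 11.0        # isochronous response "within 11 microseconds"
MEAN_SPEEDUP = 62       # "more than 62x on average"
MAX_SPEEDUP = 30_000    # "more than 30,000x" on maximum latency


def implied_baseline_us(hw_latency_us: float, factor: float) -> float:
    """Latency a software baseline would need for `factor` to hold."""
    return hw_latency_us * factor


implied_mean_ms = implied_baseline_us(HW_MEAN_US, MEAN_SPEEDUP) / 1000.0
implied_max_ms = implied_baseline_us(HW_MAX_US, MAX_SPEEDUP) / 1000.0

print(f"implied CPU mean latency: ~{implied_mean_ms:.3f} ms")
print(f"implied CPU max latency:  ~{implied_max_ms:.0f} ms")
```

Under these assumptions, the implied worst case on the CPU side comes out at roughly 330 ms, which is consistent with the "hundreds of milliseconds" the abstract attributes to modern CPU systems.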

Paper Structure (10 sections, 6 figures, 1 table)

This paper contains 10 sections, 6 figures, 1 table.

Introduction
Background
ROS 2 on a Chip
Processor Design
Thousands-Fold faster, deterministic and isochronous
100-40000x better than other IP cores
500$\times$ more energy-efficient
Discussion and future work
Achieving Brain-Like Speeds and Efficiency in Robotic Networking
Future work

Figures (6)

Figure 1: Breakdown of mean round-trip network latency in microseconds (µs) across various combinations of hardware and software implementations. Mean round-trip time (RTT) latencies were measured over 1M samples while sending small ROS messages in a ping-pong pattern. ROBOTCORE for ROS 2, ROBOTCORE RTPS, and ROBOTCORE UDP/IP run on an FPGA at 156 MHz; software implementations run on an AMD Ryzen 5 PRO 4650G.
Figure 2: Resource utilization breakdown for the "ROS 2 on a Chip" design on FPGA platforms, detailing the allocations for the integrated submodules. ROBOTCORE® for ROS 2 provides core ROS 2 abstractions; ROBOTCORE® RTPS provides core RTPS message-passing infrastructure and DDS interoperability; and ROBOTCORE® UDP/IP enables high-speed, fixed low-latency networking stack interactions. This composition highlights the comprehensive approach to hardware-accelerated robotic communication.
Figure 3: Hardware accelerator data flow diagrams for UDP/IP, RTPS/DDS, and ROS 2.
Figure 4: Breakdown of mean and maximum round-trip network latency in microseconds (µs) across various combinations of hardware and software implementations, on a logarithmic scale. The figure shows how a hardware implementation delivers a faster, isochronous interaction.
Figure 5: Mean round-trip network latency in microseconds (µs) across various combinations of hard and soft processors providing ROS 2 interoperability, on a logarithmic scale.
...and 1 more figure
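
The captions for Figures 1 and 4 describe round-trip latencies collected in a ping-pong pattern and summarized as mean and maximum values. A minimal sketch of that measurement pattern is shown below, assuming plain UDP sockets on the loopback interface rather than ROS 2, RTPS, or the authors' benchmark harness. The function names, payload, and sample count are hypothetical, and the sample count is far smaller than the 1M samples the caption mentions.

```python
import socket
import threading
import time


def echo_server(sock: socket.socket, count: int) -> None:
    """Reflect up to `count` datagrams back to their sender (the 'pong' side)."""
    try:
        for _ in range(count):
            data, addr = sock.recvfrom(2048)
            sock.sendto(data, addr)
    except OSError:
        # Timeout or closed socket: stop echoing instead of blocking forever.
        return


def measure_rtt_us(samples: int = 1000, payload: bytes = b"ping") -> list:
    """Return one round-trip time in microseconds per ping-pong exchange."""
    server = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    server.settimeout(2.0)
    server.bind(("127.0.0.1", 0))  # let the OS choose a free port
    server_addr = server.getsockname()
    worker = threading.Thread(target=echo_server, args=(server, samples))
    worker.start()

    client = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    client.settimeout(2.0)
    rtts = []
    try:
        for _ in range(samples):
            start = time.perf_counter_ns()
            client.sendto(payload, server_addr)   # ping
            client.recvfrom(2048)                 # pong
            rtts.append((time.perf_counter_ns() - start) / 1000.0)
    finally:
        client.close()
        worker.join()
        server.close()
    return rtts


rtts = measure_rtt_us(samples=200)
print(f"mean RTT: {sum(rtts) / len(rtts):.1f} us, max RTT: {max(rtts):.1f} us")
```

Reporting the maximum alongside the mean matters here because the gap between the two is what the captions refer to as isochronous behavior: a software stack on a general-purpose OS typically shows occasional outliers far above its mean, whereas a fixed hardware datapath is expected to keep the two close together.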
