PAAM: A Framework for Coordinated and Priority-Driven Accelerator Management in ROS 2

Daniel Enright, Yecheng Xiang, Hyunjong Choi, Hyoseung Kim

TL;DR

PAAM tackles the challenge of predictable accelerator access in ROS 2 by introducing a dedicated accelerator resource server that arbitrates requests with chain-centric priorities. The framework adds a two-level hierarchical queueing model and device-aware execution paths for GPUs and TPUs, enabling per-segment and per-chain worst-case bounds while supporting both local and remote accelerator usage. Empirical results on Jetson hardware show significant reductions in critical-chain latency (up to about 91% in some scenarios) and robust adherence to analytical bounds, with overhead primarily attributed to DDS data transport. By enabling granular, priority-driven scheduling at the application layer without modifying accelerator drivers, PAAM offers a practical pathway to real-time, safety-critical robotic systems that leverage heterogeneous accelerators.

Abstract

This paper proposes a Priority-driven Accelerator Access Management (PAAM) framework for multi-process robotic applications built on top of the Robot Operating System (ROS) 2 middleware platform. The framework addresses the issue of predictable execution of time- and safety-critical callback chains that require hardware accelerators such as GPUs and TPUs. PAAM provides a standalone ROS executor that acts as an accelerator resource server, arbitrating accelerator access requests from all other callbacks at the application layer. This approach enables coordinated and priority-driven accelerator access management in multi-process robotic systems. The framework design is directly applicable to all types of accelerators and enables granular control over how specific chains access accelerators, making it possible to achieve predictable real-time support for accelerators used by safety-critical callback chains without making changes to underlying accelerator device drivers. PAAM is also accompanied by a theoretical analysis that upper-bounds the worst-case response time of safety-critical callback chains that require accelerator access. The paper further demonstrates that complex robotic systems with extensive accelerator usage may achieve up to a 91% reduction in the end-to-end response time of their critical callback chains when integrated with PAAM.
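
The arbitration idea described above can be illustrated with a small sketch. This is not PAAM's implementation or its two-level queueing model: it is a toy, single-queue arbiter that assumes each request carries the priority of its callback chain, that a higher number means a higher priority, that requests of equal priority are served in arrival order, and that a request runs to completion once started. The names `AcceleratorServer`, `submit`, `run_next`, and `drain`, as well as the chain names, are hypothetical.

```python
import heapq
import itertools


class AcceleratorServer:
    """Toy arbiter: always serves the pending request with the highest chain priority."""

    def __init__(self):
        self._heap = []
        # Monotonic counter gives FIFO order among requests of equal priority.
        self._arrival = itertools.count()

    def submit(self, chain, chain_priority, work):
        # heapq is a min-heap, so the priority is negated
        # (assumption: a higher number means a higher priority).
        heapq.heappush(self._heap, (-chain_priority, next(self._arrival), chain, work))

    def run_next(self):
        """Run one request to completion; return (chain, result), or None if idle."""
        if not self._heap:
            return None
        _, _, chain, work = heapq.heappop(self._heap)
        return chain, work()

    def drain(self):
        """Serve every pending request and return the chains in service order."""
        order = []
        while (item := self.run_next()) is not None:
            order.append(item[0])
        return order


server = AcceleratorServer()
server.submit("logging", 1, lambda: "log done")
server.submit("braking", 9, lambda: "brake done")
server.submit("planning", 5, lambda: "plan done")
server.submit("braking", 9, lambda: "brake done again")

served = server.drain()
print(served)  # ['braking', 'braking', 'planning', 'logging']
```

Because all requests pass through one server process, the service order depends only on chain priority and arrival order rather than on which client process happened to reach the device driver first, which is the property the abstract attributes to application-layer arbitration.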

Paper Structure (30 sections, 5 theorems, 8 equations, 14 figures, 1 table)

Key Result

Lemma 1

The maximum number of requests that an accelerator segment $\tau_{k,q}$ from a schedulable chain $\Gamma_{c'}$ can generate in an arbitrary interval $t$ is bounded by

Figures (14)

  • Figure 1: Chain configuration of Apex.AI's Autoware reference system [web:ros2refsys]. The numbers in rounded boxes are relative callback priorities (higher means higher priority); node colors indicate the node-to-executor allocation when four single-threaded executors are used; both the priority assignment and the executor allocation follow those provided in [choi2022priority].
  • Figure 2: ROS 2 PAAM Framework
  • Figure 3: Client Registration Sequence
  • Figure 4: Server Data Structure of Client Information
  • Figure 5: Sample Shared Memory Region
  • ...and 9 more figures

Theorems & Definitions (9)

  • Lemma 1: arrival bound
  • proof
  • Lemma 2: handling time per segment
  • proof
  • Lemma 3: handling time per chain job
  • proof
  • Lemma 4: chain with no accelerator segment [choi2021picas]
  • Theorem 1: PAAM
  • proof