RaceMOP: Mapless Online Path Planning for Multi-Agent Autonomous Racing using Residual Policy Learning

Raphael Trumpp, Ehsan Javanmardi, Jin Nakazato, Manabu Tsukada, Marco Caccamo

TL;DR

This paper introduces RaceMOP, a novel method for mapless online path planning in multi-agent racing of F1TENTH cars. In simulation, RaceMOP shows superior handling over existing mapless planners and generalizes to unknown racetracks, which suggests potential for broader applications in robotics.

Abstract

The interactive decision-making in multi-agent autonomous racing offers insights valuable beyond the domain of self-driving cars. Mapless online path planning is of particular practical appeal, but its limited planning horizon makes it challenging to overtake opponents safely. To address this, we introduce RaceMOP, a novel method for mapless online path planning designed for multi-agent racing of F1TENTH cars. Unlike classical planners that rely on predefined racing lines, RaceMOP operates without a map, utilizing only local observations to execute high-speed overtaking maneuvers. Our approach combines an artificial potential field (APF) method as a base policy with residual policy learning to enable long-horizon planning. We advance the field by introducing a novel approach for policy fusion with the residual policy directly in probability space. Extensive experiments on twelve simulated racetracks validate that RaceMOP is capable of long-horizon decision-making with robust collision avoidance during overtaking maneuvers. RaceMOP demonstrates superior handling over existing mapless planners and generalizes to unknown racetracks, affirming its potential for broader applications in robotics. Our code is available at http://github.com/raphajaner/racemop.

Paper Structure (39 sections, 12 equations, 5 figures, 4 tables)

Figures (5)

  • Figure 1: Our method, named RaceMOP, is a novel mapless online path planner for multi-agent racing that uses only local observations. This method fuses an APF planner with a learned residual policy for simulated F1TENTH cars.
  • Figure 2: The ego vehicle (blue) perceives the environment by LiDAR points; green points show the wall (black line). When another vehicle (red) is present, its shape is approximately reflected by LiDAR points (red), but parts of the wall get occluded (red area). After filtering the points, the gap behind the ego vehicle is closed with artificial points (orange).
  • Figure 3: Only the subset of LiDAR points (blue) closest to the ego vehicle's edge points (black) is considered as repulsive forces (blue arrows) for the APF, while the goal point (purple) is attractive. The calculated path (light gray) is smoothed (dark gray) to avoid abrupt direction changes. The tracking point (brown) is found at a fixed lookahead distance.
  • Figure 4: RaceMOP's architecture combines a base policy with a residual policy to learn the parameters of a probability distribution $\mathcal{N}$ from only local observations $s_t=\{L_t, s'_{t}\}$ that contain a history of $n_f$ frames.
  • Figure 5: Exemplary overtaking maneuvers of RaceMOP for five different, replicated real-world racetracks where the ego vehicle (blue, full line) overtakes the opponent (red, dashed line), showing various strategic behaviors. Discrete timesteps $t_1, \dots, t_7$ of the vehicle's pose are given every $0.5\,\mathrm{s}$.
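
The attractive and repulsive forces described in the Figure 3 caption can be illustrated with a minimal sketch of a textbook artificial potential field step. This is not the authors' implementation: the function name `apf_direction`, the gains `k_att` and `k_rep`, the influence radius `d0`, and the cutoff `n_closest` are all hypothetical, and distances are measured from the ego origin rather than from the vehicle's edge points as in the paper.

```python
import numpy as np


def apf_direction(lidar_points, goal, k_att=1.0, k_rep=0.5, d0=1.0, n_closest=10):
    """Return a unit steering direction in the ego frame (ego at the origin).

    Illustrative only: all parameter names and default values are assumptions.
    """
    pts = np.asarray(lidar_points, dtype=float).reshape(-1, 2)
    goal = np.asarray(goal, dtype=float)

    # Attractive force: pulls the ego vehicle toward the goal point.
    force = k_att * goal

    # Repulsive forces: only the n_closest LiDAR points are considered.
    dists = np.linalg.norm(pts, axis=1)
    nearest = np.argsort(dists)[:n_closest]
    for p, d in zip(pts[nearest], dists[nearest]):
        if 1e-6 < d < d0:  # points beyond the influence radius are ignored
            magnitude = k_rep * (1.0 / d - 1.0 / d0) / d**2
            force += magnitude * (-p / d)  # push away from the obstacle point

    norm = np.linalg.norm(force)
    return force / norm if norm > 0.0 else np.zeros(2)


# Goal straight ahead, one obstacle point ahead and slightly to the left:
# the resulting direction still points forward but bends to the right.
direction = apf_direction([[0.5, 0.3]], goal=[2.0, 0.0])
```

In a planner of this kind, the direction would be integrated over several steps to produce a path, which is then smoothed and tracked at a lookahead distance, as the caption describes.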