Generation of Geodesics with Actor-Critic Reinforcement Learning to Predict Midpoints

Kazumi Kasaura

TL;DR

This work addresses all-pairs geodesic generation on manifolds with infinitesimally defined metrics by learning to predict midpoints with an actor-critic framework. The midpoint tree (MT) recursively inserts predicted midpoints to construct geodesics. Functional equations tie the midpoint predictions to distance estimates $V$, and the generated paths converge to true geodesics under mild assumptions. Theoretical results cover midpoint and distance consistency, iteration, Finsler extensions, and free-space handling with obstacle penalties. Empirically, MT performs competitively across five path-planning tasks with diverse metrics and constraints, with improved sample efficiency and path quality, especially on hard planning tasks and agents with complex kinematics. The approach offers a scalable, metric-agnostic route to geodesic generation in robotics and geometric optimization, with potential extensions to off-policy learning and environment-conditioned policies.
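The recursive midpoint insertion described above can be illustrated with a minimal sketch. It is not the paper's implementation: the names `midpoint_tree`, `predict_midpoint`, and `euclidean_midpoint` are hypothetical, and the exact Euclidean midpoint stands in for the learned actor so that the example is self-contained.

```python
from typing import Callable, List, Tuple

Point = Tuple[float, ...]


def euclidean_midpoint(x: Point, y: Point) -> Point:
    """Stand-in for a learned midpoint predictor (assumption: flat metric)."""
    return tuple((a + b) / 2.0 for a, b in zip(x, y))


def midpoint_tree(
    x: Point,
    y: Point,
    predict_midpoint: Callable[[Point, Point], Point],
    depth: int,
) -> List[Point]:
    """Return a path from x to y built by recursive midpoint insertion.

    At depth 0 the path is just the two endpoints. Each extra level
    asks the predictor for a midpoint and recurses on both halves,
    so the result has 2**depth + 1 waypoints.
    """
    if depth == 0:
        return [x, y]
    m = predict_midpoint(x, y)
    left = midpoint_tree(x, m, predict_midpoint, depth - 1)
    right = midpoint_tree(m, y, predict_midpoint, depth - 1)
    # Drop the duplicated midpoint where the two halves meet.
    return left + right[1:]


path = midpoint_tree((0.0, 0.0), (4.0, 0.0), euclidean_midpoint, depth=2)
```

With a trained predictor in place of `euclidean_midpoint`, the same recursion would yield waypoints that follow the learned metric rather than a straight line.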

Abstract

To find the shortest paths for all pairs on manifolds with infinitesimally defined metrics, we introduce a framework to generate them by predicting midpoints recursively. To learn midpoint prediction, we propose an actor-critic approach. We prove the soundness of our approach and show experimentally that the proposed method outperforms existing methods on several planning tasks, including path planning for agents with complex kinematics and motion planning for multi-degree-of-freedom robot arms.

Paper Structure (35 sections, 6 theorems, 78 equations, 7 figures, 3 tables, 1 algorithm)

Introduction
Related Works
Path Planning with Reinforcement Learning
Goal-Conditioned Reinforcement Learning and Sub-Goals
Preliminaries and Notation
Quasi-Metric Space
Finsler Geometry
Theoretical Results
Midpoint Tree
Functional Equation
Iteration
Finsler Case
Free Space
Experiments
Learning Algorithm
...and 20 more sections

Key Result

Proposition 2

Assume that $(X,d)$ has the midpoint property and that $\pi$ and $V$ satisfy (eq:Vpi), (eq:piargmin), and (eq:dpi). Assume also that $\pi$ is uniformly continuous. Let $\varepsilon \in (0, 1/9)$. If there exists $\delta>0$ such that $V$ approximates $d$ locally for pairs with distances less than $\delta$, then $V$ approximates $d$ globally, which means that … In particular, if such $\delta>0$ exists for …

Figures (7)

Figure 1: Midpoint tree generation of a geodesic (dotted curve).
Figure 2: Positions of $x,y,z,w,m,p$.
Figure 3: Example of switching geodesics.
Figure 4: Success rate plots.
Figure 5: Examples of generated paths.
...and 2 more figures

Theorems & Definitions (29)

Example 1
Example 2
Remark 1
Proposition 2
proof
Remark 3
Remark 4
Proposition 6
Example 3
Example 4
...and 19 more
