Generation of Geodesics with Actor-Critic Reinforcement Learning to Predict Midpoints
Kazumi Kasaura
TL;DR
This work addresses all-pairs geodesic generation on manifolds with infinitesimally defined metrics by learning to predict midpoints within an actor-critic framework. A midpoint tree (MT) constructs a geodesic by recursively inserting predicted midpoints between the endpoints. Functional equations tie the midpoint predictions to the distance estimates $V$, and the theory shows convergence to true geodesics under mild assumptions. The theoretical results cover consistency of the midpoint and distance estimates, iteration, extensions to Finsler metrics, and the handling of free space through obstacle penalties. Empirically, MT performs competitively on five path-planning tasks with diverse metrics and constraints, showing improved sample efficiency and path quality, especially on hard planning tasks and for agents with complex kinematics. The approach offers a scalable, metric-agnostic route to geodesic generation in robotics and geometric optimization, with potential extensions to off-policy learning and environment-conditioned policies.
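The recursive midpoint insertion can be illustrated with a minimal sketch. It is not the authors' implementation: the names `midpoint_tree` and `euclidean_midpoint` are hypothetical, and the Euclidean midpoint is only a stand-in for the learned midpoint predictor (the actor).

```python
from typing import Callable, List, Tuple

Point = Tuple[float, ...]


def euclidean_midpoint(x: Point, y: Point) -> Point:
    """Stand-in predictor (assumption): the exact midpoint in flat space."""
    return tuple((a + b) / 2.0 for a, b in zip(x, y))


def midpoint_tree(
    x: Point,
    y: Point,
    depth: int,
    predict_midpoint: Callable[[Point, Point], Point] = euclidean_midpoint,
) -> List[Point]:
    """Return 2**depth + 1 waypoints from x to y by recursive midpoint insertion."""
    if depth == 0:
        return [tuple(x), tuple(y)]
    m = predict_midpoint(x, y)
    left = midpoint_tree(x, m, depth - 1, predict_midpoint)
    right = midpoint_tree(m, y, depth - 1, predict_midpoint)
    # The midpoint ends `left` and starts `right`; keep it only once.
    return left + right[1:]


path = midpoint_tree((0.0, 0.0), (4.0, 8.0), depth=2)
```

With a learned predictor passed as `predict_midpoint`, the same recursion would yield a curved path; with the flat stand-in used here, the waypoints lie evenly spaced on the straight segment.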
Abstract
To find the shortest paths for all pairs on manifolds with infinitesimally defined metrics, we introduce a framework to generate them by predicting midpoints recursively. To learn midpoint prediction, we propose an actor-critic approach. We prove the soundness of our approach and show experimentally that the proposed method outperforms existing methods on several planning tasks, including path planning for agents with complex kinematics and motion planning for multi-degree-of-freedom robot arms.
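To make "infinitesimally defined metric" concrete, the sketch below approximates the length of a discretized path by summing the local metric over short segments. All names here (`path_length`, `euclidean_metric`, `half_plane_metric`) are hypothetical, and the two example metrics are illustrative choices that are not taken from the paper.

```python
import math
from typing import Callable, Sequence, Tuple

Point = Tuple[float, ...]
Metric = Callable[[Point, Point], float]  # (position x, small displacement v) -> length


def euclidean_metric(x: Point, v: Point) -> float:
    """Flat metric: the length of v does not depend on the position x."""
    return math.sqrt(sum(c * c for c in v))


def half_plane_metric(x: Point, v: Point) -> float:
    """Illustrative curved metric (upper half-plane): Euclidean norm divided by height."""
    return math.sqrt(sum(c * c for c in v)) / x[1]


def path_length(points: Sequence[Point], metric: Metric) -> float:
    """Approximate path length by evaluating the metric at each segment's midpoint."""
    total = 0.0
    for p, q in zip(points, points[1:]):
        v = tuple(b - a for a, b in zip(p, q))
        mid = tuple((a + b) / 2.0 for a, b in zip(p, q))
        total += metric(mid, v)
    return total


straight = [(0.0, 0.0), (1.5, 2.0), (3.0, 4.0)]
length = path_length(straight, euclidean_metric)
```

The approximation improves as the waypoints get closer together, which is why a path refined by repeated midpoint insertion pairs naturally with a metric that is only defined for small displacements.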
