Signatures Meet Dynamic Programming: Generalizing Bellman Equations for Trajectory Following
Motoya Ohnishi, Iretiayo Akinola, Jie Xu, Ajay Mandlekar, Fabio Ramos
TL;DR
This work introduces signature control, a framework that generalizes dynamic programming (DP) from states to entire trajectories by using path signatures. The authors reformulate DP in terms of a path-to-go $S$-function and use Chen’s identity to show that the resulting trajectory-level backup subsumes and extends the classical Bellman update, which enables adaptive time steps and robustness to model misspecification. The framework is instantiated as Signature MPC, a model predictive control method that optimizes over the signature of the full path using a receding-horizon surrogate cost and a terminal $S$-function. Experiments on simple and robotic tasks show improved tracking accuracy and disturbance robustness. The approach offers a principled way to encode rich geometric information about trajectories, which may improve data efficiency and resilience in control and reinforcement learning (RL) settings; further theory and real-time deployment remain open directions.
Abstract
Path signatures have been proposed as a powerful representation of paths that efficiently captures a path's analytic and geometric characteristics and has useful algebraic properties, including fast concatenation of paths through tensor products. Signatures have recently been widely adopted in machine learning problems for time series analysis. In this work, we establish connections between value functions typically used in optimal control and intriguing properties of path signatures. These connections motivate our novel control framework with signature transforms that efficiently generalizes the Bellman equation to the space of trajectories. We analyze the properties and advantages of the framework, termed signature control. In particular, we demonstrate that (i) it can naturally deal with varying/adaptive time steps; (ii) it propagates higher-level information more efficiently than value function updates; (iii) it is robust to dynamical system misspecification over long rollouts. As a specific case of our framework, we devise a model predictive control method for path tracking. This method generalizes integral control and is therefore suitable for problems with unknown disturbances. The proposed algorithms are tested in simulation with differentiable physics models on typical control and robotics tasks, including a point mass, curve following for an ant model, and a robotic manipulator.
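The concatenation property mentioned in the abstract (Chen's identity) can be checked numerically. The following is a minimal sketch, not the authors' implementation: it assumes a piecewise-linear path and truncates the signature at depth 2, and the helper names `segment_signature`, `chen_product`, and `path_signature` are hypothetical names chosen for illustration.

```python
import numpy as np

def segment_signature(delta):
    """Depth-2 signature of a straight segment with increment `delta`.

    Level 1 is the increment itself; level 2 is (delta ⊗ delta) / 2.
    """
    delta = np.asarray(delta, dtype=float)
    return delta, np.outer(delta, delta) / 2.0

def chen_product(sig_a, sig_b):
    """Truncated tensor product: signature of path A followed by path B."""
    a1, a2 = sig_a
    b1, b2 = sig_b
    return a1 + b1, a2 + b2 + np.outer(a1, b1)

def path_signature(points):
    """Depth-2 signature of the piecewise-linear path through `points`."""
    points = np.asarray(points, dtype=float)
    dim = points.shape[1]
    # Start from the signature of the trivial (constant) path.
    sig = (np.zeros(dim), np.zeros((dim, dim)))
    for delta in np.diff(points, axis=0):
        sig = chen_product(sig, segment_signature(delta))
    return sig

# Example path in the plane (illustrative values only).
path = np.array([[0.0, 0.0], [1.0, 0.0], [1.0, 2.0], [3.0, 3.0]])

whole = path_signature(path)
first_part = path_signature(path[:3])   # points 0..2
second_part = path_signature(path[2:])  # points 2..3

# Chen's identity: the signature of the whole path equals the tensor
# product of the signatures of its two pieces.
glued = chen_product(first_part, second_part)
assert np.allclose(whole[0], glued[0])
assert np.allclose(whole[1], glued[1])
```

In this sketch, the signature of a long path is assembled from the signatures of its pieces without revisiting the underlying points, which is the algebraic structure that a trajectory-level backup can build on. Level 1 recovers the total displacement, and the antisymmetric part of level 2 is the signed (Lévy) area enclosed by the path.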
