Advancing Frontiers of Path Integral Theory for Stochastic Optimal Control
Apurva Patil
TL;DR
This work advances Path Integral Control as a scalable, simulator-driven framework for Stochastic Optimal Control (SOC) by formulating and solving six classes of SOC problems, including Chance-Constrained SOC (CC-SOC) and Two-Player Stochastic Differential Games. It uses the Feynman-Kac representation to transform the nonlinear Hamilton-Jacobi-Bellman (HJB) and Hamilton-Jacobi-Isaacs (HJI) PDEs into tractable path-integral expressions, enabling real-time policy synthesis via Monte Carlo rollouts that parallelize well on GPUs. For CC-SOC, a unifying duality-based approach is developed that establishes strong duality under certain assumptions and offers both PDE-based and importance-sampling-based risk estimation; a dual ascent algorithm then satisfies the chance constraints iteratively and online. The dissertation further extends path-integral methods to hierarchical task control, deceptive control, and the synthesis of stealthy attacks together with corresponding mitigation strategies, linking these to risk-sensitive control and robust game-theoretic formulations. The resulting framework supports simulator-driven autonomy in complex, uncertain environments and offers practical tools, including an open-source library, for real-time, high-dimensional control under uncertainty and adversarial conditions.
Abstract
Stochastic Optimal Control (SOC) problems arise in systems influenced by uncertainty, such as autonomous robots or financial models. Traditional methods like dynamic programming are often intractable for high-dimensional, nonlinear systems due to the curse of dimensionality. This dissertation explores the path integral control framework as a scalable, sampling-based alternative. By reformulating SOC problems as expectations over stochastic trajectories, it enables efficient policy synthesis via Monte Carlo sampling and supports real-time implementation through GPU parallelization. We apply this framework to six classes of SOC problems: Chance-Constrained SOC, Stochastic Differential Games, Deceptive Control, Task Hierarchical Control, Risk Mitigation of Stealthy Attacks, and Discrete-Time LQR. A sample complexity analysis for the discrete-time case is also provided. These contributions establish a foundation for simulator-driven autonomy in complex, uncertain environments.
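To make the idea of "expectations over stochastic trajectories" concrete, the following is a minimal illustrative sketch, not the dissertation's implementation or its open-source library. It assumes a scalar system dx = u dt + sigma dW with cost E[phi(x_T) + ∫ (V(x) + (R/2) u^2) dt] and the usual path-integral condition lambda = R * sigma^2. The function name `path_integral_control` and all parameters are hypothetical. Uncontrolled rollouts are weighted by exp(-cost / lambda), and the control is recovered from the weighted average of the first noise increment.

```python
import numpy as np

def path_integral_control(x0, horizon, dt, sigma, R, state_cost, terminal_cost,
                          n_samples=100_000, rng=None):
    """Monte Carlo estimate of the optimal control at state x0.

    Assumed model (illustrative only): dx = u dt + sigma dW, with running
    cost V(x) + (R/2) u^2 and terminal cost phi(x_T).
    """
    rng = np.random.default_rng() if rng is None else rng
    n_steps = int(round(horizon / dt))
    lam = R * sigma**2  # temperature linking control cost and noise level

    # Standard normal increments for every rollout and every time step.
    xi = rng.standard_normal((n_samples, n_steps))

    # Simulate the uncontrolled dynamics (u = 0) and accumulate path costs.
    x = np.full(n_samples, float(x0))
    path_cost = np.zeros(n_samples)
    for k in range(n_steps):
        path_cost += state_cost(x) * dt
        x = x + sigma * np.sqrt(dt) * xi[:, k]
    path_cost += terminal_cost(x)

    # Exponential weights; subtracting the minimum avoids numerical underflow.
    w = np.exp(-(path_cost - path_cost.min()) / lam)
    w /= w.sum()

    # Control = weighted mean of the first noise increment, per unit time.
    return sigma / np.sqrt(dt) * np.dot(w, xi[:, 0])

# Sanity check on a case with a closed-form answer: V = 0, phi(x) = (q/2) x^2.
# The LQR solution is u* = -(q/R) x0 / (1 + q T / R).
q, R, T, x0 = 1.0, 1.0, 1.0, 1.0
u_hat = path_integral_control(
    x0=x0, horizon=T, dt=0.1, sigma=1.0, R=R,
    state_cost=lambda x: np.zeros_like(x),
    terminal_cost=lambda x: 0.5 * q * x**2,
    rng=np.random.default_rng(0),
)
u_exact = -(q / R) * x0 / (1.0 + q * T / R)  # equals -0.5 for these values
print(u_hat, u_exact)
```

Each rollout is independent of the others, which is why this kind of estimator maps naturally onto GPU parallelization; the loop over samples is already vectorized here, and only the loop over time steps is sequential.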
