Deep Learning for Continuous-time Stochastic Control with Jumps
Patrick Cheridito, Jean-Loup Dupret, Donatien Hainaut
TL;DR
The paper tackles finite-horizon continuous-time stochastic control with jumps by proposing two model-based neural-network algorithms that learn both the value function and the optimal control. The GPI-PINN method minimizes a physics-informed residual loss for the Hamilton-Jacobi-Bellman (HJB) equation, a partial integro-differential equation (PIDE), and uses a trick that avoids explicit gradient and Hessian computations. GPI-CBU instead employs an expectation-free Bellman update that avoids both jump integrals and high-order derivatives, which greatly improves scalability in high dimensions with jumps. Empirical results on high-dimensional LQR and consumption-investment problems show that GPI-CBU in particular achieves accurate value functions and policies in settings with 50 to 150 dimensions, often outperforming model-free RL baselines when jumps are present. The work demonstrates that exploiting the known dynamics through model-based neural-network training yields solutions that are valid globally over the space-time domain, and it offers a practical path for solving complex stochastic control problems in high dimensions.
Abstract
In this paper, we introduce a model-based deep-learning approach to solve finite-horizon continuous-time stochastic control problems with jumps. We iteratively train two neural networks: one to represent the optimal policy and the other to approximate the value function. Leveraging a continuous-time version of the dynamic programming principle, we derive two different training objectives based on the Hamilton-Jacobi-Bellman equation, ensuring that the networks capture the underlying stochastic dynamics. Empirical evaluations on different problems illustrate the accuracy and scalability of our approach, demonstrating its effectiveness in solving complex, high-dimensional stochastic control tasks.
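For orientation, the kind of Hamilton-Jacobi-Bellman equation referred to above can be sketched for a generic controlled jump-diffusion. The notation below is an assumption made for illustration and is not taken from the paper: drift \mu, diffusion \sigma, jump size \gamma, a Poisson random measure N with finite intensity measure \nu(dz)\,dt, running reward f, terminal reward g, and no discounting.

```latex
% Illustrative sketch only: all symbols are assumed notation, not the paper's.
\begin{align*}
dX_t &= \mu(X_t,\alpha_t)\,dt + \sigma(X_t,\alpha_t)\,dW_t
        + \int_{E} \gamma(X_{t-}, z, \alpha_t)\,N(dt,dz),\\
V(t,x) &= \sup_{\alpha}\, \mathbb{E}\Big[\int_t^T f(s,X_s,\alpha_s)\,ds + g(X_T)\,\Big|\,X_t = x\Big],\\
0 &= \partial_t V(t,x) + \sup_{a}\Big\{ f(t,x,a) + \mu(x,a)^\top \nabla_x V(t,x)
        + \tfrac12 \operatorname{Tr}\big(\sigma\sigma^\top(x,a)\,\nabla_x^2 V(t,x)\big)\\
&\qquad\qquad + \int_E \big[V(t, x+\gamma(x,z,a)) - V(t,x)\big]\,\nu(dz)\Big\},\\
V(T,x) &= g(x).
\end{align*}
```

In this generic form, the integral against \nu is the jump integral, and the terms in \nabla_x V and \nabla_x^2 V are the derivatives that make a direct residual loss costly in high dimensions. A value network would approximate V, and a policy network would approximate the maximizer a as a function of (t, x).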
