Deep multitask neural networks for solving some stochastic optimal control problems
Christian Yeo
TL;DR
Addresses stochastic optimal control (SOC) problems solved through the backward dynamic programming principle (BDPP) when the state distribution is unknown and the state cannot be simulated. Proposes one deep multitask neural network per date, with a shared feature extractor and task-specific heads, so that all bang-bang decisions are learned simultaneously. Training relies on a novel loss weighting scheme, Sigmoid-Moving Average GradNorm (S-MAG), which balances learning across many tasks. The approach is validated on commodity derivative problems (Take-or-Pay and swing contracts) in one- and three-factor models, where it outperforms state-of-the-art BDPP-based methods and the Longstaff-Schwartz approach. The result is a scalable, data-efficient framework for BDPP-type SOC problems, with potential applicability beyond finance to other domains that require dynamic programming under uncertainty.
Abstract
Most existing neural network-based approaches for solving stochastic optimal control problems using the associated backward dynamic programming principle rely on the ability to simulate the underlying state variables. However, in some problems this simulation is infeasible, which forces a discretization of the state variable space and the training of one neural network per discretization point. This approach becomes computationally inefficient when the state variable space is large. In this paper, we consider a class of such stochastic optimal control problems and introduce an effective solution based on multitask neural networks. To train our multitask neural network, we introduce a novel scheme that dynamically balances the learning across tasks. Through numerical experiments on real-world derivatives pricing problems, we show that our method outperforms state-of-the-art approaches.
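
To make the architecture concrete, the following is a minimal sketch of a multitask network with a shared feature extractor and one head per task, written in plain Python. It is an illustration under stated assumptions and not the implementation used in the paper: the class name `MultitaskNet`, the helper `bang_bang`, the layer sizes, the tanh and sigmoid activations, and the 0.5 threshold are all hypothetical choices, and each task is assumed to correspond to one point of the discretized state space. Only the forward pass is shown; training and the S-MAG weighting scheme are not reproduced here.

```python
import math
import random


def init_layer(n_in, n_out, rng):
    """Return (weights, biases) for a dense layer with small random weights."""
    scale = 1.0 / math.sqrt(n_in)
    weights = [[rng.uniform(-scale, scale) for _ in range(n_in)]
               for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


def dense(layer, x):
    """Affine map W x + b applied to a single input vector x."""
    weights, biases = layer
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, biases)]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


class MultitaskNet:
    """Hypothetical layout: one shared trunk, one scalar head per task."""

    def __init__(self, n_inputs, n_hidden, n_tasks, seed=0):
        rng = random.Random(seed)
        self.trunk = init_layer(n_inputs, n_hidden, rng)
        self.heads = [init_layer(n_hidden, 1, rng) for _ in range(n_tasks)]

    def forward(self, x):
        # Shared features are computed once and reused by every head.
        features = [math.tanh(h) for h in dense(self.trunk, x)]
        # Each head returns a value in (0, 1), read here (by assumption)
        # as the probability of taking the upper bang-bang control.
        return [sigmoid(dense(head, features)[0]) for head in self.heads]


def bang_bang(probabilities, q_min, q_max):
    """Threshold each head's output at 0.5 to pick q_min or q_max."""
    return [q_max if p >= 0.5 else q_min for p in probabilities]


# Example: 2 input features (e.g. risk factors), 5 tasks, i.e. 5 grid points.
net = MultitaskNet(n_inputs=2, n_hidden=8, n_tasks=5)
probs = net.forward([0.3, -1.2])
decisions = bang_bang(probs, q_min=0.0, q_max=1.0)
```

The point of the design is that a single forward pass yields one decision per task, where the per-point approach described in the abstract would need a separate network for each discretization point.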
