Entropy-Regularized Mean-Variance Portfolio Optimization with Jumps

Christian Bender, Nguyen Tran Thuan

TL;DR

This paper addresses risk-aware portfolio optimization under jumps by introducing entropy-regularized exploratory controls. It constructs a continuous-time exploratory SDE with Lévy jumps from a discrete-time randomized-control scheme and proves that the optimal distributional control is Gaussian, yielding a linear wealth SDE in closed form. The analysis combines a dynamic-programming/HJB-PIDE framework with a quadratic ansatz to obtain explicit forms for the optimal control and the Lagrange multiplier, and it characterizes the wealth dynamics in multidimensional jump-diffusion settings. A key technical contribution is the weak convergence of the discrete-time integrators to the limiting SDE-driven dynamics, which provides a rigorous basis for RL-inspired exploration in continuous time and offers practical formulas for implementing exploration-regularized MV strategies in jump settings.

Abstract

Motivated by the trade-off between exploitation and exploration in reinforcement learning, we study a continuous-time entropy-regularized mean variance portfolio selection problem in the presence of jumps. We propose an exploratory SDE for the wealth process associated with multiple risky assets which exhibit Lévy jumps. In contrast to the existing literature, we study the limiting behavior of the natural discrete-time formulation of the wealth process associated to a randomized control in order to derive the continuous-time dynamics. We then show that an optimal distributional control of the continuous-time entropy-regularized exploratory mean-variance problem is Gaussian. The respective optimal wealth process solves a linear SDE whose representation is explicitly obtained.
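The discrete-time formulation with a randomized control mentioned in the abstract can be illustrated with a minimal sketch. This is not the paper's scheme: it assumes a single risky asset, a zero interest rate, compound-Poisson jumps with normally distributed sizes, and a Gaussian control with fixed mean and standard deviation rather than the optimal state-dependent one. The function name and all parameter names and values are hypothetical.

```python
import numpy as np

def simulate_wealth(x0=1.0, T=1.0, n=250, b=0.05, sigma=0.2,
                    jump_rate=0.5, jump_mean=-0.05, jump_std=0.1,
                    ctrl_mean=0.5, ctrl_std=0.3, seed=0):
    """Simulate one wealth path under a randomized (Gaussian) control.

    Assumed toy model (not taken from the paper): over each step of
    length dt the asset's excess return increment is
        dL = b*dt + sigma*sqrt(dt)*Z + (sum of compound-Poisson jumps),
    and the amount invested u is drawn afresh from N(ctrl_mean, ctrl_std^2),
    independently of dL.
    """
    rng = np.random.default_rng(seed)
    dt = T / n
    wealth = np.empty(n + 1)
    wealth[0] = x0
    for i in range(1, n + 1):
        # Randomized control: sample the amount invested for this step.
        u = rng.normal(ctrl_mean, ctrl_std)
        # Jump part: Poisson number of jumps, each with a normal size.
        n_jumps = rng.poisson(jump_rate * dt)
        jumps = rng.normal(jump_mean, jump_std, size=n_jumps).sum()
        # Drift + diffusion + jumps over the step.
        dL = b * dt + sigma * np.sqrt(dt) * rng.normal() + jumps
        # Self-financing wealth update with zero interest rate.
        wealth[i] = wealth[i - 1] + u * dL
    return wealth

path = simulate_wealth()
print(path.shape, path[-1])
```

Increasing `n` refines the time grid; the paper's contribution concerns what such randomized schemes converge to as the grid is refined, which this toy simulation does not establish.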

Paper Structure (37 sections, 12 theorems, 190 equations)

Key Result

Proposition 3.2

For $n \ge 1$, $1 \le i \le n$, there exist (uniquely up to a $\mathbb P$-null set) a random vector $\mu^H_{n, i-1}$ and a random matrix $\vartheta^H_{n, i-1} \in \mathbb S^D_{++}$, both $\mathcal{F}_{n, i-1}$-measurable and square integrable, and a square integrable random vector $\eta^H_{n, i}$ such that

Theorems & Definitions (38)

  • Proposition 3.2
  • proof
  • Remark 3.4
  • Theorem 3.5
  • Remark 3.6
  • Remark 3.7
  • Remark 3.8
  • Definition 4.1: Admissible control
  • Remark 4.2
  • Definition 4.4
  • ...and 28 more