Pathwise Relaxed Optimal Control of Rough Differential Equations

Prakash Chakraborty, Harsha Honnappa, Samy Tindel

TL;DR

The paper develops a rigorous framework for pathwise relaxed optimal control of rough differential equations (RDEs), introducing a rough Hamilton-Jacobi-Bellman (HJB) equation and a precise notion of rough viscosity solution. It builds the rough-path toolkit (weakly and strongly controlled paths, sewing) and proves existence and, under stronger regularity, uniqueness for RDEs with drift, including the case of measure-valued controls. A rough HJB equation is derived from the dynamic programming principle, and the value function is shown to be a rough viscosity solution, with a flow-based approach used to establish uniqueness and stability. The work also proves that smooth approximations converge to the rough solution, laying theoretical groundwork for reinforcement learning in noisy, non-Markovian environments, where entropy-regularized rewards encourage exploration.

Abstract

This note lays part of the theoretical ground for a definition of differential systems modeling reinforcement learning in continuous-time, non-Markovian rough environments. Specifically, we focus on optimal relaxed control of rough equations (the term relaxed referring to the fact that controls have to be considered as measure-valued objects). With reinforcement learning in view, our reward functions encompass forms that involve an entropy-type term favoring exploration. In this context, our contribution focuses on a careful definition of the corresponding relaxed Hamilton-Jacobi-Bellman (HJB)-type equation. A substantial part of our endeavor consists in a precise definition of the notion of test function and viscosity solution for the rough relaxed PDE obtained in this framework. Note that this task is often merely sketched in the rough viscosity literature, in spite of the fact that it gives a proper meaning to the differential system at stake. In the last part of the paper, we prove that the natural value function in our context solves a relaxed rough HJB equation in the viscosity sense.

Paper Structure

This paper contains 15 sections, 23 theorems, 227 equations.

Key Result

Proposition 2.2

Let $h \in \mathcal{C}_3^{\mu}(V)$ for $\mu > 1$ be such that $\delta h = 0$. Then there exists a unique $g = \Lambda(h) \in \mathcal{C}_2^{\mu}(V)$ such that $\delta g = h$. Furthermore, for such an $h$, the following relations hold true:
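
As a side illustration of the operator $\delta$ appearing in this statement, the following minimal sketch checks two elementary identities numerically. It assumes the convention commonly used in the rough-path literature, $(\delta f)_{st} = f_t - f_s$ for one-index paths and $(\delta g)_{sut} = g_{st} - g_{su} - g_{ut}$ for two-index maps; that convention is an assumption here, since it is not restated above. The names `delta1`, `delta2` and `germ`, and the smooth test paths $x = \sin$, $y = \cos$, are hypothetical choices made only for this example.

```python
import math

def delta1(f):
    """Increment of a one-index path: (delta f)_{st} = f(t) - f(s)."""
    return lambda s, t: f(t) - f(s)

def delta2(g):
    """Coboundary of a two-index map: (delta g)_{sut} = g_{st} - g_{su} - g_{ut}."""
    return lambda s, u, t: g(s, t) - g(s, u) - g(u, t)

# Hypothetical smooth test paths (assumption: chosen only for illustration).
x = math.sin
y = math.cos

def germ(s, t):
    """Riemann-sum germ y_s * (x_t - x_s) for the integral of y against x."""
    return y(s) * (x(t) - x(s))

s, u, t = 0.1, 0.4, 0.9

# Identity 1: delta applied to an exact increment vanishes.
defect_of_increment = delta2(delta1(x))(s, u, t)

# Identity 2: the germ is not an exact increment; its coboundary is
# -(y_u - y_s) * (x_t - x_u), a product of two increments.
lhs = delta2(germ)(s, u, t)
rhs = -(y(u) - y(s)) * (x(t) - x(u))

print(abs(defect_of_increment) < 1e-12, abs(lhs - rhs) < 1e-12)
```

For these 1-Lipschitz paths the second identity gives $|(\delta\,\mathrm{germ})_{sut}| \le |t-s|^2$, so the three-index map $h = \delta\,\mathrm{germ}$ has regularity exponent $2 > 1$, which is the kind of input the proposition asks for.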

Theorems & Definitions (61)

  • Remark 2.1
  • Proposition 2.2
  • Proposition 2.3
  • Remark 2.5
  • Remark 2.6
  • Definition 2.7
  • Remark 2.8
  • Proposition 2.9
  • proof
  • Proposition 2.11
  • ...and 51 more