Optimal control under unknown intensity with Bayesian learning

Nicolas Baradel, Quentin Cormier

TL;DR

This work studies the optimal control of a Poisson-driven system whose intensity depends on an unknown parameter λ, embedding Bayesian filtering into the stochastic control problem. A Girsanov-based reformulation turns the problem into a tractable dynamic program and, when the intensity is linear in λ, the problem reduces to a finite-dimensional HJB equation whose unique viscosity solution characterizes the value function. The authors establish Lipschitz regularity, dynamic programming principles for deterministic and stochastic stopping times, and a complete viscosity-solution framework including a comparison theorem. Numerical examples show that the equation can be solved in practice and illustrate how posterior uncertainty evolves under optimal control, with implications for online learning and neuroscience applications.

Abstract

We investigate an optimal control problem motivated by neuroscience, where the dynamics is driven by a Poisson process with a controlled stochastic intensity and an unknown parameter. Given a prior distribution for the unknown parameter, we describe its evolution using Bayes' rule. We reformulate the optimization problem by applying Girsanov's theorem and establish a dynamic programming principle. Finally, we characterize the value function as the unique viscosity solution to a finite-dimensional Hamilton-Jacobi-Bellman equation, which can be solved numerically.
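
As a worked illustration of the Bayes step in the linear case, suppose (an assumption made here for concreteness) that the intensity is $\lambda\, c_s$, where $c_s$ is an observable predictable factor and $\mu_0$ denotes the prior on $\lambda$. The symbols $c_s$, $\mu_0$, $\mu_t$ and $A_t$ are notation introduced for this sketch and are not taken from the paper. The likelihood of the observed path on $[0, t]$ with jump times $\tau_1 < \dots < \tau_{N_t}$ is proportional to $\prod_{k \leq N_t} (\lambda\, c_{\tau_k}) \exp\left(-\lambda \int_0^t c_s\, ds\right)$, and the factors $c_{\tau_k}$ cancel after normalization, which gives

$$
\mu_t(d\lambda) = \frac{\lambda^{N_t}\, e^{-\lambda A_t}\, \mu_0(d\lambda)}{\int \ell^{N_t}\, e^{-\ell A_t}\, \mu_0(d\ell)}, \qquad A_t = \int_0^t c_s\, ds.
$$

Under this assumption the posterior depends on the path only through the pair $(N_t, A_t)$, which is consistent with the finite-dimensional reduction described in the TL;DR.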

Paper Structure

This paper contains 21 sections, 30 theorems, 144 equations, 2 figures.

Key Result

Proposition 2.3

Let $\gamma$ be a process such that, $\mathbb{P}$-a.s., $\gamma \in L^2([0, T])$. For $\gamma$ to be $({\mathcal{F}}^N_t)$-predictable, it is necessary and sufficient that it admits the representation where for all $k \in \mathbb{N}$, the mapping $(\omega, t) \mapsto \Gamma_k(\omega)(t)$ is ${\mathcal{F}}^N_{\tau_k} \otimes {\mathcal{B}}([0, T])$-measurable, where ${\mathcal{B}}([0, T])$ denotes the Borel $\sigma$-algebra on $[0, T]$.
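
For orientation, representations of this type in the point-process literature commonly take the following form. It is stated here as an assumption, with $\tau_0 = 0$ and $\tau_k$ the $k$-th jump time of $N$, and may differ in detail from the authors' display:

$$
\gamma_t(\omega) = \sum_{k \geq 0} \Gamma_k(\omega)(t)\, \mathbf{1}_{\{\tau_k(\omega) < t \leq \tau_{k+1}(\omega)\}}, \qquad t \in (0, T].
$$

In words, between two consecutive jumps the process is a deterministic function of time once the history up to the last jump is known.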

Figures (2)

  • Figure 1: An optimal trajectory with intensity $\lambda \exp(2(y-1))$ and prior distribution $\frac{1}{10}\left(\delta_{0} + 2\delta_{0.25} + 4\delta_{0.5} + 2\delta_{0.75} + \delta_{1}\right)$. The true value is $\lambda = 1$.
  • Figure 2: An optimal trajectory with intensity approximating $\lambda \mathbf{1}_{\{y \geq 1\}}$ and prior $\mathcal{U}\left([0, 2]\right)$. The true value is $\lambda = 1$.
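
The posterior over the unknown parameter in a setting like Figure 1 can be sketched numerically. The snippet below is a minimal illustration, not the authors' code. It assumes the intensity has the form $\lambda\, c(y_s)$, so that the likelihood depends on the path only through the jump count and the integrated factor $\int_0^t c(y_s)\, ds$ (called `exposure` here). The function name `posterior_weights`, the frozen state $y \equiv 1$ and the horizon used in the demo are hypothetical choices made for this example.

```python
import numpy as np

# Prior from Figure 1: atoms of lambda and their (unnormalized) weights.
ATOMS = [0.0, 0.25, 0.5, 0.75, 1.0]
WEIGHTS = [1.0, 2.0, 4.0, 2.0, 1.0]


def posterior_weights(atoms, weights, n_jumps, exposure):
    """Posterior on a discrete prior, assuming intensity lambda * c(y_s).

    The posterior is proportional to
        weight * lambda**n_jumps * exp(-lambda * exposure),
    where exposure is the integral of c(y_s) over the observation window.
    """
    atoms = np.asarray(atoms, dtype=float)
    weights = np.asarray(weights, dtype=float)

    # n_jumps * log(lambda), evaluated safely at the atom lambda = 0.
    safe = np.where(atoms > 0, atoms, 1.0)
    log_pow = n_jumps * np.log(safe)
    if n_jumps > 0:
        # A single observed jump rules out lambda = 0.
        log_pow = np.where(atoms > 0, log_pow, -np.inf)

    log_post = np.log(weights) + log_pow - atoms * exposure
    log_post -= log_post.max()  # stabilize before exponentiating
    post = np.exp(log_post)
    return post / post.sum()


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    true_lam, y, horizon = 1.0, 1.0, 50.0  # hypothetical demo values
    c = np.exp(2.0 * (y - 1.0))            # factor exp(2(y-1)) from Figure 1
    exposure = c * horizon                 # y is frozen, so the integral is c * T
    n_jumps = rng.poisson(true_lam * exposure)
    print(posterior_weights(ATOMS, WEIGHTS, n_jumps, exposure))
```

Working in log space keeps the update stable when the jump count is large, and the explicit handling of the atom at $\lambda = 0$ avoids the undefined product $0 \cdot \log 0$ when no jump has been observed.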

Theorems & Definitions (54)

  • Definition 2.2
  • Proposition 2.3
  • proof
  • Lemma 2.4
  • proof
  • Remark 2.5
  • Lemma 2.6
  • proof
  • Lemma 2.7
  • proof
  • ...and 44 more