Partially observed controlled Markov chains and optimal control of the Wonham filter

Fulvia Confortola; Marco Fuhrman

TL;DR

This work addresses optimal control of finite-state, continuous-time Markov chains under partial observation, where the observation process is corrupted by Brownian noise. It constructs a controlled jump process with stochastic transition rates, then uses a Girsanov change of measure to reformulate the problem as a separated control problem whose fully observed state is the (unnormalized) Wonham filter ρ. The paper proves that the separated problem is equivalent to the original one and then analyzes it: well-posedness and properties of the controlled Wonham filter dynamics, a characterization of the value function as the unique viscosity solution of the dynamic programming equation for both infinite and finite horizons, verification theorems, and a stochastic maximum principle that does not require convexity of the action set.

Abstract

We consider a class of optimal control problems, with finite or infinite horizon, for a continuous-time Markov chain with finite state space, in which the control process affects the transition rates. We suppose that the controlled process cannot be observed, and that at any time the control actions are chosen based on the observation of a related stochastic process perturbed by an exogenous Brownian motion. We describe a construction of the controlled Markov chain, having stochastic transition rates adapted to the observation filtration. By a change of probability measure of Girsanov type, we introduce the so-called separated optimal control problem, where the state is the conditional (unnormalized) distribution of the controlled Markov chain and the observation process becomes a driving Brownian motion, and we prove the equivalence with the original control problem. The controlled equations for the separated problem are an instance of the Wonham filtering equations. Next we present an analysis of the separated problem: we characterize the value function as the unique viscosity solution to the dynamic programming equations (in both the parabolic and the elliptic case), we prove verification theorems, and we prove a version of the stochastic maximum principle in the form of a necessary condition for optimality.
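To make the separated problem concrete, here is a minimal simulation sketch of an unnormalized Wonham filter for a two-state chain. It is not taken from the paper: it assumes the textbook form d rho_t = Q(u_t)^T rho_t dt + diag(h) rho_t dW_t with observation dW_t = h(X_t) dt + dB_t, and the rate matrix, the observation function h, the threshold feedback rule, and the splitting discretization are all hypothetical choices made for illustration.

```python
import math
import random


def wonham_step(rho, Q, h, dW, dt):
    """One splitting step for d rho = Q^T rho dt + diag(h) rho dW (assumed form)."""
    n = len(rho)
    # Observation update: exact solution of d rho_i = h_i rho_i dW, keeps rho_i > 0.
    corrected = [rho[i] * math.exp(h[i] * dW - 0.5 * h[i] ** 2 * dt) for i in range(n)]
    # Prediction update: explicit Euler step for d rho = Q^T rho dt.
    return [
        corrected[j] + dt * sum(Q[i][j] * corrected[i] for i in range(n))
        for j in range(n)
    ]


def rates(u):
    """Hypothetical controlled generator: the action u speeds up jumps 0 -> 1."""
    return [[-(1.0 + u), 1.0 + u], [2.0, -2.0]]


def simulate(T=1.0, dt=1e-3, seed=0):
    """Simulate the hidden chain, the observation, and the filter on [0, T]."""
    rng = random.Random(seed)
    h = [-1.0, 1.0]          # hypothetical observation function on states {0, 1}
    x = 0                    # hidden state, never used by the controller
    rho = [0.5, 0.5]         # unnormalized filter, started at the prior
    for _ in range(int(T / dt)):
        # The control depends only on the filter, i.e. on the observations.
        pi1 = rho[1] / (rho[0] + rho[1])
        u = 1.0 if pi1 < 0.5 else 0.0
        Q = rates(u)
        # Hidden chain: leave the current state with probability about rate * dt.
        if rng.random() < -Q[x][x] * dt:
            x = 1 - x
        # Observation increment: signal plus Brownian noise.
        dW = h[x] * dt + math.sqrt(dt) * rng.gauss(0.0, 1.0)
        rho = wonham_step(rho, Q, h, dW, dt)
    return rho, x


if __name__ == "__main__":
    rho, x = simulate()
    total = rho[0] + rho[1]
    print("normalized filter:", [r / total for r in rho], "hidden state:", x)
```

Dividing rho by its total mass recovers the conditional distribution of the hidden state. The exponential observation update is used here only because it keeps the components positive, which a plain Euler-Maruyama step does not guarantee.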

Paper Structure (11 sections, 11 theorems, 171 equations)

Introduction
A construction of a point process with random compensator
The partially observed control problem and its reformulations
The partially observed control problem for the reference probability
The partially observed control problem for the physical probability
The separated optimal control problem
Optimal control of the Wonham filter: setting and preliminary results
Dynamic programming equation for infinite horizon: viscosity theory
Dynamic programming equation for finite horizon: viscosity theory
Dynamic programming equation: verification theorems
Stochastic maximum principle

Key Result

Lemma 2.1

The process $X^q$ is càdlàg and $\mathbb{F}^{\bar{N},1}$-adapted.

Theorems & Definitions (15)

Lemma 2.1
Theorem 2.1
Remark 3.1
Proposition 3.1
Remark 3.2
Theorem 4.1
Theorem 4.2
Theorem 5.1
Theorem 5.2
Theorem 6.1
...and 5 more
