Adaptive Estimation of the Transition Density of Controlled Markov Chains

Imon Banerjee, Vinayak Rao, Harsha Honnappa

TL;DR

The paper proposes an adaptive estimator for the transition densities of controlled Markov chains. It builds on recent advances in adaptive density estimation: using a constrained minimax criterion over a dense class of estimators, it selects an estimator that minimizes a loss function and fits the observed data well.

Abstract

Estimating the transition dynamics of controlled Markov chains is crucial in fields such as time series analysis, reinforcement learning, and system exploration. Traditional non-parametric density estimation methods often assume independent samples and require oracle knowledge of smoothness parameters such as the Hölder continuity coefficient. These assumptions are unrealistic in controlled Markovian settings, especially when the controls are non-Markovian, since such parameters need to hold uniformly over all control values. To address this gap, we propose an adaptive estimator for the transition densities of controlled Markov chains that does not rely on prior knowledge of smoothness parameters or on assumptions about the distribution of the control sequence. Our method builds upon recent advances in adaptive density estimation by selecting an estimator that minimizes a loss function and fits the observed data well, using a constrained minimax criterion over a dense class of estimators. We validate the performance of our estimator through oracle risk bounds, employing both randomized and deterministic versions of the Hellinger distance as loss functions. This approach provides a robust and flexible framework for estimating transition densities in controlled Markovian systems without imposing strong assumptions.
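
For readers unfamiliar with the loss named in the abstract, the following is a minimal sketch of the classical squared Hellinger distance between two piecewise-constant densities on a common partition, using the convention $h^2(p, q) = \frac{1}{2}\int (\sqrt{p} - \sqrt{q})^2$. It is an illustration only: the function name `squared_hellinger` and its arguments are hypothetical, and the randomized and deterministic versions used in the paper are not reproduced here.

```python
import math


def squared_hellinger(p, q, cell_widths):
    """Squared Hellinger distance between two piecewise-constant densities.

    p, q        : density values on each cell of a shared partition
    cell_widths : width (Lebesgue measure) of each cell

    Convention assumed here: h^2(p, q) = 1/2 * integral (sqrt(p) - sqrt(q))^2,
    so the value lies in [0, 1] when p and q both integrate to 1.
    """
    if not (len(p) == len(q) == len(cell_widths)):
        raise ValueError("p, q and cell_widths must have the same length")
    if any(v < 0 for v in list(p) + list(q) + list(cell_widths)):
        raise ValueError("densities and cell widths must be non-negative")
    # On each cell the integrand is constant, so the integral is a weighted sum.
    total = sum(
        w * (math.sqrt(a) - math.sqrt(b)) ** 2
        for a, b, w in zip(p, q, cell_widths)
    )
    return 0.5 * total
```

Under this convention, identical densities are at distance 0 and densities with disjoint supports are at distance 1, which makes the loss bounded regardless of how heavy the tails of the true density are.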

Paper Structure

This paper contains 64 sections, 32 theorems, and 294 equations.

Key Result

Proposition 1

For a given transition function $s$, for any partition $m$, the associated estimator $\hat{s}_m$ satisfies

Theorems & Definitions (77)

  • Remark 1
  • Proposition 1
  • Remark 2
  • Definition 1
  • Remark 3
  • Theorem 1
  • proof
  • Proposition 2
  • Proposition 3
  • Remark 4
  • ...and 67 more