On the Ziv-Merhav theorem beyond Markovianity II: leveraging the thermodynamic formalism

Nicholas Barnfield, Raphaël Grondin, Gaia Pozzoli, Renaud Raquépas

Abstract

We prove asymptotic results for a modification of the cross-entropy estimator originally introduced by Ziv and Merhav in the Markovian setting in 1993. Our results concern a more general class of decoupled measures on shift spaces over a finite alphabet and in particular imply strong asymptotic consistency of the modified estimator for all pairs of functions of stationary, irreducible, finite-state Markov chains satisfying a mild decay condition. Our approach is based on the study of a rescaled cumulant-generating function called the cross-entropic pressure, importing to information theory some techniques from the study of large deviations within the thermodynamic formalism.

Paper Structure (15 sections, 14 theorems, 85 equations, 3 figures)

Key Result

Theorem 2.6

Suppose that $\mathbf{P}$ satisfies (SLD) and (UD), that $\mathbf{Q}$ satisfies (UD), and that (ND) holds. If, in addition, $\mathop{\mathrm{supp}}\nolimits \mathbf{Q} \subseteq \mathop{\mathrm{supp}}\nolimits \mathbf{P}$, then the mZM estimator $Q_N$ converges to $h_{\textnormal{c}}(\mathbf{Q}|\mathbf{P})$ as $N \to \infty$ for $(\mathbf{P}\otimes\mathbf{Q})$-almost every $(x,y)$.

Figures (3)

  • Figure 1: The dashed line has a slope that is slightly less steep than the least steep slope $D_-\bar{q}(0)$ in the subdifferential of the convex function $\bar{q}$ at the origin, so we can find a small region to the left of the origin where the graph of $\bar{q}$ lies below it.
  • Figure 2: The convergence of estimators to $h_{\textnormal{c}}(\mathbf{Q}|\mathbf{P})$ is illustrated in a numerical experiment. Here, $Q_N$ is the mZM estimator introduced herein, $\tilde{Q}_N$ is the mZM estimator without the $-c_N$ "correction" in the denominator, $Q_N^{\textnormal{ZM}}$ is the original ZM estimator presented in BGPR and MZ93, and $\ln N / \Lambda_N$ is the longest-match length estimator, which has been shown to be asymptotically consistent under weaker assumptions (Ko98, CDEJR23w). They are compared to the sequence in Lemma \ref{lem:cross-SMB-gap}, which is computable here because we know the marginals of the measure $\mathbf{P}$; this is of course not the case in practical applications. Both $\mathbf{Q}$ and $\mathbf{P}$ are stationary HMMs on $\{\mathsf{0},\mathsf{1}\}^{\mathbf N}$.
  • Figure 3: The convergence of various estimators to $h_{\textnormal{c}}(\mathbf{Q}|\mathbf{P})$ is illustrated in more numerical experiments, subject to the same legend as Figure 2: top left, both $\mathbf{Q}$ and $\mathbf{P}$ are the HMM given in Example 4.5 of BGPR, where \eqref{eq:add-letter} does not hold; top right, $\mathbf{Q}$ is a fully supported Bernoulli measure on $\{\mathsf{0},\mathsf{1},\mathsf{2}\}^{\mathbf N}$ and $\mathbf{P}$ is an HMM on $\Omega$ with $\mathop{\mathrm{supp}}\nolimits \mathbf{P} \subsetneq \{\mathsf{0},\mathsf{1},\mathsf{2}\}^{\mathbf N}$; bottom left, $\mathbf{P}$ is a Markov measure and $\mathbf{Q}$ an HMM on $\{\mathsf{0},\mathsf{1},\mathsf{2}\}^{\mathbf N}$; bottom right, $\mathbf{Q}$ and $\mathbf{P}$ both come from the so-called "Keep-Switch instrument" (which belongs to the family of HMMs) with parameters $(q_1, q_2) = (\tfrac{1}{2},\tfrac{1}{4})$; see BCJPPExamples.
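
For orientation, the original ZM estimator $Q_N^{\textnormal{ZM}}$ named in the caption of Figure 2 can be sketched in a few lines. The sketch assumes the standard 1993 construction: $y_1^N$ is parsed sequentially into the longest words that occur somewhere in $x_1^N$, and the estimate is $c_N \ln N / N$, where $c_N$ is the number of words. The function names `zm_parse` and `zm_estimate`, the use of Python strings for sequences, and the convention that a letter absent from $x$ forms a word of length 1 are assumptions made for illustration. This is not the modified (mZM) estimator studied in the paper, whose definition is not reproduced on this page.

```python
import math


def zm_parse(y, x):
    """Parse y sequentially into the longest words that occur as substrings of x.

    Assumed convention: a letter of y that never occurs in x is a word of length 1.
    Sequences are Python strings, so `in` is a substring test.
    """
    words = []
    i = 0
    while i < len(y):
        length = 1
        # Extend the current word while the longer candidate still occurs in x.
        while i + length < len(y) and y[i:i + length + 1] in x:
            length += 1
        words.append(y[i:i + length])
        i += length
    return words


def zm_estimate(y, x):
    """Return c_N * ln(N) / N, with c_N the number of words in the parsing of y.

    Assumes x and y have the same length N >= 1.
    """
    n = len(y)
    if n == 0 or len(x) != n:
        raise ValueError("x and y must be non-empty and of equal length")
    c_n = len(zm_parse(y, x))
    return c_n * math.log(n) / n
```

For example, `zm_parse("abab", "abba")` splits `"abab"` into the two words `"ab"` and `"ab"`, so `zm_estimate("abab", "abba")` equals $2 \ln 4 / 4 = \ln 2$. The substring test makes this sketch quadratic or worse in $N$; a suffix-tree or suffix-automaton index of $x$ would be the natural choice for long sequences.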

Theorems & Definitions (40)

  • Definition 2.1
  • Example 2.2
  • Remark 2.3
  • Remark 2.4
  • Remark 2.5
  • Theorem 2.6
  • Remark 2.7
  • Remark 2.8
  • Corollary 2.9
  • proof
  • ...and 30 more