Refined Statistical Bounds for Classification Error Mismatches with Constrained Bayes Error

Zijian Yang, Vahe Eminyan, Ralf Schlüter, Hermann Ney

TL;DR

This paper studies the relationship between the Kullback-Leibler divergence and the classification error mismatch, and derives a refined Kullback-Leibler-divergence-based bound on that mismatch under the constraint that the Bayes error is below a threshold.

Abstract

In statistical classification/multiple hypothesis testing and machine learning, a model distribution estimated from the training data is usually applied to replace the unknown true distribution in the Bayes decision rule, which introduces a mismatch between the Bayes error and the model-based classification error. In this work, we derive the classification error bound to study the relationship between the Kullback-Leibler divergence and the classification error mismatch. We first reconsider the statistical bounds based on classification error mismatch derived in previous works, employing a different method of derivation. Then, motivated by the observation that the Bayes error is typically low in machine learning tasks like speech recognition and pattern recognition, we derive a refined Kullback-Leibler-divergence-based bound on the error mismatch with the constraint that the Bayes error is lower than a threshold.
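
To make the quantities in the abstract concrete, the following is a minimal sketch on a made-up discrete problem. It computes the Bayes error, the model-based classification error, their mismatch, and a conditional Kullback-Leibler divergence between the true and model posteriors. All numbers, variable names, and helper functions are hypothetical, and the specific form of the divergence used here (true posterior against model posterior, averaged over the true marginal) is an assumption rather than the paper's stated definition.

```python
import math

# Hypothetical toy problem: 3 observations x, 2 classes c (all numbers made up).
pr_x = [0.5, 0.3, 0.2]                                   # true marginal pr(x)
pr_c_given_x = [[0.9, 0.1], [0.2, 0.8], [0.6, 0.4]]      # true posterior pr(c|x)
q_c_given_x = [[0.8, 0.2], [0.3, 0.7], [0.4, 0.6]]       # model posterior q(c|x)


def argmax(values):
    """Index of the largest entry (first one wins on ties)."""
    return max(range(len(values)), key=lambda i: values[i])


def bayes_error(marginal, true_post):
    """Error of the Bayes decision rule, which decides with the true posterior."""
    return 1.0 - sum(px * max(post) for px, post in zip(marginal, true_post))


def model_error(marginal, true_post, model_post):
    """Error when the decision uses the model posterior but is scored under the truth."""
    return 1.0 - sum(
        px * post[argmax(q)]
        for px, post, q in zip(marginal, true_post, model_post)
    )


def conditional_kl(marginal, true_post, model_post):
    """Assumed divergence: sum_x pr(x) sum_c pr(c|x) log(pr(c|x) / q(c|x))."""
    total = 0.0
    for px, post, q in zip(marginal, true_post, model_post):
        total += px * sum(p * math.log(p / qc) for p, qc in zip(post, q) if p > 0)
    return total


e_star = bayes_error(pr_x, pr_c_given_x)
e_q = model_error(pr_x, pr_c_given_x, q_c_given_x)
delta = e_q - e_star          # error mismatch, never negative
kl = conditional_kl(pr_x, pr_c_given_x, q_c_given_x)

print(f"Bayes error: {e_star:.4f}")
print(f"Model error: {e_q:.4f}")
print(f"Mismatch:    {delta:.4f}")
print(f"KL:          {kl:.4f}")
```

For these made-up numbers the Bayes error is 0.19, the model-based error is 0.23, and the mismatch is 0.04: the model decides differently from the true posterior only on the third observation. A bound of the kind described above would relate the printed divergence to this mismatch.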

Paper Structure (10 sections, 5 theorems, 47 equations, 1 figure)


Key Result

Theorem 1

The local $f$-divergence between $pr(c|x)$ and $q(c|x)$ is tightly lower-bounded by a function of the local error mismatch $\Delta_q(x)$ in the following way:

Figures (1)

  • Figure 1: Comparison of the Nussbaum bound [nussbaum2013relative] and the refined bound in this paper. The simulations in the upper figure are run under the constraint $E_* \leq 0.08$. The grey dots mark the simulation points.

Theorems & Definitions (9)

  • Theorem 1
  • proof
  • Lemma 1
  • Theorem 2
  • Lemma 2
  • proof
  • Lemma 3
  • proof
  • proof: Proof of Theorem \ref{theorem:refinedbound}