Classification Error Bound for Low Bayes Error Conditions in Machine Learning

Zijian Yang; Vahe Eminyan; Ralf Schlüter; Hermann Ney

TL;DR

This work analyzes how the mismatch between the true data distribution and a trained model affects classification error under low Bayes error conditions. It refines classification error bounds based on the Kullback–Leibler divergence, introduces a linear approximation of the bound for small Bayes error $E_*$, and extends the bounds to class priors and sequences. The authors connect these bounds to practical metrics such as cross-entropy (CE) loss, language-model perplexity, and word error rate, yielding analytic relationships among these measures in tasks like automatic speech recognition. The results indicate how a small Bayes error constrains the model-based error, and they inform CE-based training and evaluation in sequence modeling settings.

Abstract

In statistical classification and machine learning, classification error is an important performance measure, and it is minimized by the Bayes decision rule. In practice, the unknown true distribution in the Bayes decision rule is usually replaced with a model distribution estimated from the training data. This substitution introduces a mismatch between the Bayes error and the model-based classification error. In this work, we apply classification error bounds to study the relationship between the error mismatch and the Kullback-Leibler divergence in machine learning. Motivated by recent observations of low model-based classification errors in many machine learning tasks, which imply that the Bayes error is lower still, we propose a linear approximation of the classification error bound for low Bayes error conditions. Then, the bound for class priors is discussed. Moreover, we extend the classification error bound to sequences. Using automatic speech recognition as a representative example of machine learning applications, this work analytically discusses the correlations among different performance measures with the extended bounds, including cross-entropy loss, language model perplexity, and word error rate.
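
The quantities named in the abstract can be made concrete with a small numerical sketch. The toy joint distributions `p` (true) and `q` (model) below are invented for illustration and do not come from the paper. The final check uses the generic textbook inequality $(E_q - E_*)^2 \le 2\,D_{KL}(p\,\|\,q)$, which follows from Pinsker's inequality; it is assumed here only as a loose sanity check and is not the refined or linearized bound that the paper proposes.

```python
import numpy as np

def classification_error(p_joint, decision):
    """Error of a decision rule under the true joint p_joint[x, c].

    decision[x] is the class index chosen for observation x.
    """
    rows = np.arange(p_joint.shape[0])
    return 1.0 - p_joint[rows, decision].sum()

def kl_divergence(p, q):
    """KL divergence D(p || q) in nats between two joint distributions."""
    mask = p > 0
    return float(np.sum(p[mask] * np.log(p[mask] / q[mask])))

# Hypothetical toy joints over 3 observations (rows) and 2 classes (columns).
p = np.array([[0.38, 0.02],
              [0.03, 0.27],
              [0.25, 0.05]])   # true distribution (illustrative values)
q = np.array([[0.30, 0.10],
              [0.16, 0.14],
              [0.20, 0.10]])   # model distribution (illustrative values)

# Bayes decision rule uses the true distribution; the model-based rule uses q.
bayes_decision = p.argmax(axis=1)
model_decision = q.argmax(axis=1)

# Both rules are evaluated under the true distribution p.
bayes_error = classification_error(p, bayes_decision)
model_error = classification_error(p, model_decision)
mismatch = model_error - bayes_error
kl = kl_divergence(p, q)

print(f"Bayes error E_*      : {bayes_error:.4f}")
print(f"Model-based error E_q: {model_error:.4f}")
print(f"Error mismatch       : {mismatch:.4f}")
print(f"KL(p || q)           : {kl:.4f}")
print(f"Pinsker-style limit  : {np.sqrt(2.0 * kl):.4f}")
```

In this toy setting the model disagrees with the Bayes decision on one observation, so the mismatch is strictly positive while staying below the loose Pinsker-style limit.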
