Weakly Supervised Learners for Correction of AI Errors with Provable Performance Guarantees

Ivan Y. Tyukin, Tatiana Tyukina, Daniel van Helden, Zedong Zheng, Evgeny M. Mirkes, Oliver J. Sutton, Qinghua Zhou, Alexander N. Gorban, Penelope Allison

TL;DR

The paper addresses AI error handling by introducing weakly supervised AI error correctors with a priori, distribution-agnostic performance guarantees. It develops an algorithm that leverages low-dimensional projections of classifier representations to produce per-class acceptance and rejection bounds, expressed via functions $\rho$ and $\psi$, and enables abstention through thresholding. The approach is demonstrated on a pottery classification task with scarce labeled data, combining a deep core classifier with Fisher-discriminant-based correctors and showing improved conditional recall on accepted predictions alongside theoretical guarantees. This work provides a practical, provable framework for correcting AI errors without heavy reliance on distributional assumptions or large labeled datasets, with immediate implications for trust and safety in deployed AI systems.

Abstract

We present a new methodology for handling AI errors by introducing weakly supervised AI error correctors with a priori performance guarantees. These AI correctors are auxiliary maps whose role is to moderate the decisions of some previously constructed underlying classifier by either approving or rejecting its decisions. The rejection of a decision can be used as a signal to suggest abstaining from making a decision. A key technical focus of the work is in providing performance guarantees for these new AI correctors through bounds on the probabilities of incorrect decisions. These bounds are distribution agnostic and do not rely on assumptions on the data dimension. Our empirical example illustrates how the framework can be applied to improve the performance of an image classifier in a challenging real-world task where training data are scarce.
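
The approve-or-reject mechanism described above can be illustrated with a minimal sketch. It is not the paper's algorithm and does not reproduce its bounds $\rho$ and $\psi$. It only assumes that a corrector projects a feature vector onto a single Fisher-discriminant direction and thresholds the result. The function names, the `keep_fraction` parameter, and the synthetic Gaussian features are all hypothetical and introduced purely for illustration.

```python
import numpy as np

def fit_fisher_direction(feats_correct, feats_error, reg=1e-6):
    """Unit-norm Fisher direction separating correct from erroneous decisions."""
    mu_c = feats_correct.mean(axis=0)
    mu_e = feats_error.mean(axis=0)
    # Sum of the two sample covariances, regularised so the solve is well-posed.
    sw = np.cov(feats_correct, rowvar=False) + np.cov(feats_error, rowvar=False)
    sw = np.atleast_2d(sw) + reg * np.eye(feats_correct.shape[1])
    w = np.linalg.solve(sw, mu_c - mu_e)
    return w / np.linalg.norm(w)

def choose_threshold(w, feats_correct, keep_fraction=0.9):
    """Threshold keeping about `keep_fraction` of the correct training decisions."""
    scores = feats_correct @ w
    return np.quantile(scores, 1.0 - keep_fraction)

def corrector(w, theta, feature):
    """True approves the classifier's decision; False rejects it (abstain)."""
    return bool(feature @ w >= theta)

# Synthetic stand-ins (assumption): features of one predicted class, split into
# decisions that were correct and decisions that were errors.
rng = np.random.default_rng(0)
feats_correct = rng.normal(loc=1.0, scale=1.0, size=(200, 5))
feats_error = rng.normal(loc=-1.0, scale=1.0, size=(40, 5))

w = fit_fisher_direction(feats_correct, feats_error)
theta = choose_threshold(w, feats_correct, keep_fraction=0.9)

# Empirical rates on the training sample itself (not a guarantee of any kind).
accept_correct = float(np.mean(feats_correct @ w >= theta))
reject_error = float(np.mean(feats_error @ w < theta))
print(f"accepted correct: {accept_correct:.2f}, rejected errors: {reject_error:.2f}")
```

In a per-class setting, one such direction and threshold would be fitted for each predicted label, and a rejection would be treated as a signal to abstain. The empirical rates printed here are training-set frequencies only; the distribution-agnostic guarantees come from the paper's theorem, not from this sketch.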

Paper Structure

This paper contains 18 sections, 2 theorems, 41 equations, 2 figures, 5 tables, and 1 algorithm.

Key Result

Theorem 1

Let the AI system $F$, feature map $\Phi$, and corrector training set $\mathcal{S}$ be defined as in eq:classifier:1--eq:training_set. Suppose that the elements of $\mathcal{S}$ are independently sampled from the (unknown) data distribution $P_D$, that $\mathcal{S}_{+,j}$ and $\mathcal{S}_{-,j}$ are and where $\rho,\psi:[0,1]\times\mathbb{N}\rightarrow \mathbb{R}$ are defined as

Figures (2)

  • Figure 1: Lower bound on the probability of correct rejection of errors, $P(\mathcal{A}(F(u), \Phi(u)) = \ell^{\times} \mid F(u) = \ell_j,\ \ell \neq \ell_j)$, provided in Theorem 1 and expressed as a function of the cardinality $M_{-,j}$ of the set $\mathcal{S}_{-,j}$ for $\Delta_j = 0.8, 0.85, 0.9, 0.95$.
  • Figure 2: Examples of images of sherds from different classes. Each row corresponds to a class, with the top row showing images from class $1$, and the bottom row showing images from class $5$.

Theorems & Definitions (2)

  • Theorem 1
  • Lemma 1