Weakly Supervised Learners for Correction of AI Errors with Provable Performance Guarantees

Ivan Y. Tyukin; Tatiana Tyukina; Daniel van Helden; Zedong Zheng; Evgeny M. Mirkes; Oliver J. Sutton; Qinghua Zhou; Alexander N. Gorban; Penelope Allison

TL;DR

The paper addresses AI error handling by introducing weakly supervised AI error correctors with a priori, distribution-agnostic performance guarantees. It develops an algorithm that leverages low-dimensional projections of classifier representations to produce per-class acceptance and rejection bounds, expressed via functions $\rho$ and $\psi$, and enables abstention through thresholding. The approach is demonstrated on a pottery classification task with scarce labeled data, combining a deep core classifier with Fisher-discriminant-based correctors and showing improved conditional recall on accepted predictions alongside theoretical guarantees. This work provides a practical, provable framework for correcting AI errors without heavy reliance on distributional assumptions or large labeled datasets, with immediate implications for trust and safety in deployed AI systems.

Abstract

We present a new methodology for handling AI errors by introducing weakly supervised AI error correctors with a priori performance guarantees. These AI correctors are auxiliary maps whose role is to moderate the decisions of some previously constructed underlying classifier by either approving or rejecting its decisions. The rejection of a decision can be used as a signal to suggest abstaining from making a decision. A key technical focus of the work is in providing performance guarantees for these new AI correctors through bounds on the probabilities of incorrect decisions. These bounds are distribution agnostic and do not rely on assumptions on the data dimension. Our empirical example illustrates how the framework can be applied to improve the performance of an image classifier in a challenging real-world task where training data are scarce.
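The approve-or-reject mechanism described above can be illustrated with a minimal sketch. It is not the paper's algorithm: it assumes a single Fisher-discriminant direction fitted on feature vectors of correctly and incorrectly classified examples for one predicted class, plus a threshold set at a lower quantile of the projections of the correct examples. The function names, the `quantile` parameter, and the regularisation constant are hypothetical, and the sketch computes none of the theoretical bounds.

```python
import numpy as np


def fit_fisher_direction(pos, neg, reg=1e-6):
    """Unit-norm Fisher discriminant direction separating `pos` from `neg`.

    pos, neg: arrays of shape (n_samples, n_features), n_features >= 2.
    """
    mu_p, mu_n = pos.mean(axis=0), neg.mean(axis=0)
    # Sum of the two class covariance matrices, regularised so the
    # linear solve below stays well-posed (assumed constant `reg`).
    sw = np.cov(pos, rowvar=False) + np.cov(neg, rowvar=False)
    sw = sw + reg * np.eye(sw.shape[0])
    w = np.linalg.solve(sw, mu_p - mu_n)
    return w / np.linalg.norm(w)


def make_corrector(pos, neg, quantile=0.05):
    """Build a corrector for one predicted class.

    pos: features of examples the core classifier got right.
    neg: features of examples the core classifier got wrong.
    Returns a function mapping features to True (approve) or False (reject).
    """
    w = fit_fisher_direction(pos, neg)
    # Assumed thresholding rule: approve when the projection is at least
    # the chosen lower quantile of the correct-decision projections.
    theta = np.quantile(pos @ w, quantile)

    def corrector(features):
        return (np.atleast_2d(features) @ w) >= theta

    return corrector


# Synthetic stand-ins for the feature vectors of correct and erroneous decisions.
rng = np.random.default_rng(0)
correct = rng.normal(2.0, 1.0, size=(200, 4))
errors = rng.normal(-2.0, 1.0, size=(200, 4))
corrector = make_corrector(correct, errors)
approved_correct = corrector(correct).mean()  # fraction of correct decisions approved
approved_errors = corrector(errors).mean()    # fraction of errors wrongly approved
```

A rejection from `corrector` would then serve as the signal to abstain rather than return the core classifier's label.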

Paper Structure

This paper contains 18 sections, 2 theorems, 41 equations, 2 figures, 5 tables, 1 algorithm.

Introduction
Notation
Problem Formulation
Main Results
Theory
Discussion
Computing the bounds
Tightness of the bounds
Abstaining from making a decision
Example: AI correctors for pottery classification
Core AI classifier
Data
Data for training core classifier
Data for training AI correctors
AI correctors
...and 3 more sections

Key Result

Theorem 1

Let the AI system $F$, feature map $\Phi$, and corrector training set $\mathcal{S}$ be defined as in (eq:classifier:1)--(eq:training_set). Suppose that the elements of $\mathcal{S}$ are independently sampled from the (unknown) data distribution $P_D$, that $\mathcal{S}_{+,j}$ and $\mathcal{S}_{-,j}$ are [...] and [...] where $\rho,\psi:[0,1]\times\mathbb{N}\rightarrow \mathbb{R}$ are defined as [...]

Figures (2)

Figure 1: Lower bound on the probability of correct rejection of errors, $P(\mathcal{A}(F(u), \Phi(u)) = \ell^{\times} \mid F(u) = \ell_j, \ \ell\neq\ell_j)$, provided in Theorem 1 and expressed as a function of the cardinality $M_{-,j}$ of the set $\mathcal{S}_{-,j}$ for $\Delta_j = 0.8, 0.85, 0.9, 0.95$.
Figure 2: Examples of images of sherds from different classes. Each row corresponds to a class, with the top row showing images from class $1$, and the bottom row showing images from class $5$.

Theorems & Definitions (2)

Theorem 1
Lemma 1
