Weakly Supervised Learners for Correction of AI Errors with Provable Performance Guarantees
Ivan Y. Tyukin, Tatiana Tyukina, Daniel van Helden, Zedong Zheng, Evgeny M. Mirkes, Oliver J. Sutton, Qinghua Zhou, Alexander N. Gorban, Penelope Allison
TL;DR
The paper addresses AI error handling by introducing weakly supervised AI error correctors with a priori, distribution-agnostic performance guarantees. It develops an algorithm that leverages low-dimensional projections of classifier representations to produce per-class acceptance and rejection bounds, expressed via functions $\rho$ and $\psi$, and enables abstention through thresholding. The approach is demonstrated on a pottery classification task with scarce labeled data, combining a deep core classifier with Fisher-discriminant-based correctors and showing improved conditional recall on accepted predictions alongside theoretical guarantees. This work provides a practical, provable framework for correcting AI errors without heavy reliance on distributional assumptions or large labeled datasets, with immediate implications for trust and safety in deployed AI systems.
Abstract
We present a new methodology for handling AI errors by introducing weakly supervised AI error correctors with a priori performance guarantees. These AI correctors are auxiliary maps whose role is to moderate the decisions of a previously constructed underlying classifier by either approving or rejecting them. A rejection can be used as a signal to abstain from making a decision. A key technical focus of the work is providing performance guarantees for these new AI correctors through bounds on the probabilities of incorrect decisions. These bounds are distribution agnostic and do not rely on assumptions about the data dimension. Our empirical example illustrates how the framework can be applied to improve the performance of an image classifier in a challenging real-world task where training data are scarce.
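
As an illustration of the approve-or-reject mechanism described above, the following minimal Python sketch fits a Fisher-discriminant corrector on feature vectors of correctly and incorrectly classified examples, then accepts a prediction when its one-dimensional projection clears a threshold. Everything in it is an assumption made for illustration: the function names (`fit_corrector`, `accept`), the synthetic Gaussian features, the small ridge term, and the midpoint threshold are hypothetical, and the sketch does not reproduce the paper's algorithm or its performance bounds.

```python
import random

def mean(rows):
    """Component-wise mean of a list of feature vectors."""
    return [sum(col) / len(rows) for col in zip(*rows)]

def covariance(rows, mu):
    """Sample covariance matrix of rows around the mean mu."""
    d = len(mu)
    s = [[0.0] * d for _ in range(d)]
    for r in rows:
        diff = [x - m for x, m in zip(r, mu)]
        for i in range(d):
            for j in range(d):
                s[i][j] += diff[i] * diff[j]
    return [[v / (len(rows) - 1) for v in row] for row in s]

def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[p] = m[p], m[c]
        for r in range(c + 1, n):
            f = m[r][c] / m[c][c]
            for k in range(c, n + 1):
                m[r][k] -= f * m[c][k]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x

def project(w, x):
    """One-dimensional projection of feature vector x onto direction w."""
    return sum(wi * xi for wi, xi in zip(w, x))

def fit_corrector(feats_correct, feats_error, ridge=1e-6):
    """Fisher direction w = (S_c + S_e)^{-1} (mu_c - mu_e) plus a threshold.

    The midpoint threshold is an illustrative choice (assumption), not a
    threshold derived from any performance guarantee.
    """
    mu_c, mu_e = mean(feats_correct), mean(feats_error)
    s_c, s_e = covariance(feats_correct, mu_c), covariance(feats_error, mu_e)
    d = len(mu_c)
    pooled = [[s_c[i][j] + s_e[i][j] + (ridge if i == j else 0.0)
               for j in range(d)] for i in range(d)]
    w = solve(pooled, [c - e for c, e in zip(mu_c, mu_e)])
    theta = 0.5 * (project(w, mu_c) + project(w, mu_e))
    return w, theta

def accept(w, theta, x):
    """Approve the underlying classifier's decision if the projection clears theta."""
    return project(w, x) >= theta

# Synthetic stand-ins for classifier representations (assumption):
# correct decisions cluster around +2, erroneous ones around -2.
rng = random.Random(0)
correct = [[rng.gauss(2.0, 1.0) for _ in range(3)] for _ in range(200)]
errors = [[rng.gauss(-2.0, 1.0) for _ in range(3)] for _ in range(200)]

w, theta = fit_corrector(correct, errors)
acc_rate_correct = sum(accept(w, theta, x) for x in correct) / len(correct)
acc_rate_error = sum(accept(w, theta, x) for x in errors) / len(errors)
print(f"accepted among correct: {acc_rate_correct:.2f}")
print(f"accepted among errors:  {acc_rate_error:.2f}")
```

In this toy setting a rejected prediction would be treated as a signal to abstain. In practice the threshold would be chosen to meet a target acceptance or rejection rate rather than set at the midpoint.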
