Towards a general framework for improving the performance of classifiers using XAI methods
Andrea Apicella, Salvatore Giugliano, Francesco Isgrò, Roberto Prevete
TL;DR
The paper addresses the problem of improving the performance of pre-trained classifiers without costly retraining by leveraging XAI explanations. It proposes a general framework in which an attribution-encoding module $F$ produces informative features $\boldsymbol{f}^{(j)}$ that, combined with the pre-trained model outputs $\boldsymbol{m}^{(j)}$, feed a simple classifier $C$ that refines the final decision. Two training strategies are outlined: an auto-encoder-based pipeline that encodes explanations into $z_x$ before training $F$ and $C$, and an encoder-decoder-based pipeline that jointly trains an encoder-decoder with $C$ using the explanations. The work highlights potential benefits such as reduced computational overhead and the possibility of obtaining explanations without an explicit XAI step, and it points to future directions on robustness and on broader applicability across XAI methods and datasets.
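The combination step described above can be illustrated with a minimal sketch. It assumes that $\boldsymbol{m}^{(j)}$ and $\boldsymbol{f}^{(j)}$ are simply concatenated and that $C$ is a single softmax layer trained by stochastic gradient descent; this summary specifies neither choice, so both are assumptions. The names `combine` and `LinearClassifier` and the toy data are hypothetical and do not come from the paper.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    top = max(logits)
    exps = [math.exp(v - top) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def combine(m, f):
    """Assumed combination rule: concatenate model outputs m with features f."""
    return list(m) + list(f)


class LinearClassifier:
    """Hypothetical stand-in for the simple classifier C (one softmax layer)."""

    def __init__(self, n_inputs, n_classes, seed=0):
        rng = random.Random(seed)
        self.w = [[rng.uniform(-0.01, 0.01) for _ in range(n_inputs)]
                  for _ in range(n_classes)]
        self.b = [0.0] * n_classes

    def predict_proba(self, x):
        logits = [sum(wi * xi for wi, xi in zip(row, x)) + bias
                  for row, bias in zip(self.w, self.b)]
        return softmax(logits)

    def predict(self, x):
        probs = self.predict_proba(x)
        return max(range(len(probs)), key=probs.__getitem__)

    def fit(self, xs, ys, lr=0.5, epochs=200):
        """Plain SGD on the cross-entropy loss, one sample at a time."""
        for _ in range(epochs):
            for x, y in zip(xs, ys):
                probs = self.predict_proba(x)
                for k in range(len(self.w)):
                    # Gradient of cross-entropy w.r.t. logit k: p_k - 1[k == y]
                    grad = probs[k] - (1.0 if k == y else 0.0)
                    self.w[k] = [w - lr * grad * xi
                                 for w, xi in zip(self.w[k], x)]
                    self.b[k] -= lr * grad
        return self


# Toy data (made up for illustration): the frozen model's outputs m are
# uninformative, while the attribution-derived features f separate the classes.
m_outputs = [[0.5, 0.5], [0.5, 0.5], [0.5, 0.5], [0.5, 0.5]]
f_features = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
labels = [0, 0, 1, 1]

inputs = [combine(m, f) for m, f in zip(m_outputs, f_features)]
clf = LinearClassifier(n_inputs=4, n_classes=2).fit(inputs, labels)
predictions = [clf.predict(x) for x in inputs]
```

Only the small classifier is trained here; the pre-trained model and the attribution encoder are represented by fixed lists of numbers, which mirrors the stated goal of avoiding retraining of the original model.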
Abstract
Modern Artificial Intelligence (AI) systems, especially Deep Learning (DL) models, make it challenging for AI researchers to understand their inner workings. eXplainable Artificial Intelligence (XAI) inspects the internal mechanisms of AI models to provide explanations about their decisions. While current XAI research predominantly concentrates on explaining AI systems, there is growing interest in using XAI techniques to automatically improve the performance of the AI systems themselves. This paper proposes a general framework for automatically improving the performance of pre-trained DL classifiers using XAI methods, avoiding the computational overhead associated with retraining complex models from scratch. In particular, we outline two possible learning strategies for implementing this architecture, which we call auto-encoder-based and encoder-decoder-based, and discuss their key aspects.
