Readable Twins of Unreadable Models

Krzysztof Pancerz; Piotr Kulicki; Michał Kalisz; Andrzej Burda; Maciej Stanisławski; Jaromir Sarzyński

TL;DR

The paper addresses the challenge of explaining unreadable deep learning models by constructing readable twins: imprecise information flow models built from a Sequential Information System (SIS) and rough set flow graphs (RSFGs). It details a stepwise pipeline (training, clustering activations, SIS construction, RSFG creation, path mining with an evolutionary algorithm, and visualization) and demonstrates the approach on MNIST digit classification. The key contribution is a concrete methodology that translates internal model behavior into human-readable, path-based explanations with visual summaries. The approach also leaves room for ontology-based extensions and for alternative graphical representations such as Petri nets.
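The step from an SIS to a flow graph can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the activations have already been clustered, that each SIS row holds one input's cluster label per layer with the predicted class in the last column, and that edges carry the standard flow graph factors (strength, certainty, coverage). The function name `build_flow_graph` and the toy data are hypothetical.

```python
from collections import Counter


def build_flow_graph(sis):
    """Build flow graph edges between consecutive columns of an SIS.

    sis: list of rows; each row lists one input's label at each layer
    (assumed layout: cluster labels first, predicted class last).
    Returns a dict mapping (source_node, target_node) to edge factors,
    where a node is a (layer_index, label) pair.
    """
    n = len(sis)
    node_counts = Counter()
    edge_counts = Counter()
    for row in sis:
        nodes = [(layer, label) for layer, label in enumerate(row)]
        node_counts.update(nodes)
        # Each input contributes one unit of flow per consecutive layer pair.
        edge_counts.update(zip(nodes, nodes[1:]))

    edges = {}
    for (src, dst), count in edge_counts.items():
        edges[(src, dst)] = {
            "strength": count / n,                   # share of all inputs
            "certainty": count / node_counts[src],   # share of flow leaving src
            "coverage": count / node_counts[dst],    # share of flow entering dst
        }
    return edges


# Toy SIS: four inputs, two clustered layers, one output column.
toy_sis = [
    ["a", "x", "7"],
    ["a", "x", "7"],
    ["a", "y", "1"],
    ["b", "y", "1"],
]
graph = build_flow_graph(toy_sis)
edge = graph[((0, "a"), (1, "x"))]
```

A path through such a graph (for example cluster `a`, then cluster `x`, then class `7`) is the kind of object a path-mining step would rank; here the edge from `a` to `x` carries half of all inputs and two thirds of the flow leaving `a`.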

Abstract

Creating responsible artificial intelligence (AI) systems is an important issue in contemporary AI research and development. One characteristic of responsible AI systems is their explainability. In this paper, we are interested in explainable deep learning (XDL) systems. By analogy with digital twins of physical objects, we introduce the idea of creating readable twins (in the form of imprecise information flow models) for unreadable deep learning models. We present the complete procedure for switching from the deep learning model (DLM) to the imprecise information flow model (IIFM). The proposed approach is illustrated with a deep learning classification model for recognizing handwritten digits from the MNIST data set.
