CONFIDERAI: a novel CONFormal Interpretable-by-Design score function for Explainable and Reliable Artificial Intelligence

Sara Narteni; Alberto Carlevaro; Fabrizio Dabbene; Marco Muselli; Maurizio Mongelli

TL;DR

The paper addresses the need for reliable and trustworthy AI by proposing conformity as a sixth pillar and by coupling conformal prediction with explainable rule-based models through CONFIDERAI, a geometry-informed score function that accounts for rule performance, the proximity of points to rule boundaries, and overlaps among rules via a geometrical rule similarity term. It defines conformal critical sets (CCS) to localize regions of the feature space where probabilistic guarantees hold, and uses these sets to retrain rule sets with improved precision on the target class while preserving interpretability. The approach is instantiated on the Logic Learning Machine and validated on ten real-world datasets, showing that the errors of the conformal predictions remain bounded as $\varepsilon$ varies and that CCS-labeling can improve precision and reduce false positives. Overall, CONFIDERAI provides a practical, interpretable framework for obtaining probabilistic guarantees in rule-based AI, with potential extensions to multi-class settings and to other rule-based models.

Abstract

Everyday life is increasingly influenced by artificial intelligence, and there is no question that machine learning algorithms must be designed to be reliable and trustworthy for everyone. Specifically, computer scientists consider an artificial intelligence system safe and trustworthy if it fulfills five pillars: explainability, robustness, transparency, fairness, and privacy. In addition to these five, we propose a sixth fundamental aspect: conformity, that is, the probabilistic assurance that the system will behave as the machine learner expects. In this paper, we present a methodology to link conformal prediction with explainable machine learning by defining a new score function for rule-based classifiers that leverages the rules' predictive ability, the geometrical position of points within rule boundaries, and the overlaps among rules, the latter through a geometrical rule similarity term. Furthermore, we address the problem of defining regions in the feature space where conformal guarantees are satisfied, by exploiting the definition of conformal critical set and showing how this set can be used to obtain new rules with improved performance on the target class. The overall methodology is tested with promising results on several datasets of real-world interest, such as domain name server tunneling detection or cardiovascular disease prediction.
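
To make the conformal step concrete, the following is a minimal sketch of generic split (inductive) conformal prediction, assuming a nonconformity score is already available for every point-label pair. The score values used here are hypothetical placeholders, not the CONFIDERAI score, and the names `conformal_threshold` and `prediction_set` are illustrative only. Given $n$ calibration scores and an error level $\varepsilon$, the threshold is the $\lceil (n+1)(1-\varepsilon) \rceil$-th smallest calibration score, and a label enters the prediction set when its score does not exceed that threshold.

```python
import math
import numpy as np

def conformal_threshold(cal_scores, eps):
    """Return the split-conformal quantile of the calibration scores."""
    n = len(cal_scores)
    k = math.ceil((n + 1) * (1 - eps))  # rank of the order statistic to use
    if k > n:
        # Too few calibration points for this eps: only the trivial
        # (all-labels) prediction set keeps the guarantee.
        return math.inf
    return float(np.sort(cal_scores)[k - 1])

def prediction_set(scores_by_label, q):
    """Keep every label whose nonconformity score does not exceed q."""
    return {label for label, s in scores_by_label.items() if s <= q}

# Hypothetical calibration scores s(x_i, y_i), evenly spread for illustration.
cal_scores = np.linspace(0.01, 0.99, 99)
q = conformal_threshold(cal_scores, eps=0.1)

# Hypothetical scores of one test point for the two labels of a binary task.
region = prediction_set({0: 0.05, 1: 0.97}, q)
```

Under exchangeability of calibration and test points, a set built this way misses the true label with probability at most $\varepsilon$. Any score can be plugged in without changing the guarantee; the choice of score only affects how small and informative the resulting sets are.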

Paper Structure (15 sections, 20 equations, 5 figures, 3 tables)

Introduction
Contribution
Related Works
Conformal Predictions and Critical Sets
Rule-Based Conformity
Rule-based classifiers notation
Geometrical Rule Similarity
CONFIDERAI score function
Toy examples
Logic Learning Machine
Experimental Results
Datasets description
Score function evaluation
Evaluation of Conformal Critical Sets
Conclusion

Figures (5)

Figure 1: Changes in the values of the score $s(\boldsymbol{x},y)$ for points within a set of toy rules designed to have different overlap levels
Figure 3: 2D scatter plots of three datasets, showing 2D boundaries of the most relevant rules from the original LLM classifier (yellow box) and from the new model derived via the conformal critical set ($\mathcal{S}_\varepsilon$ rule, green box).
Figure : (a) Average errors
Figure : (a) Average errors
Figure : (b) Average sizes

Theorems & Definitions (1)

Remark 3.1: On the meaning of critical points
