Generalizing Logic-based Explanations for Machine Learning Classifiers via Optimization

Francisco Mateus Rocha Filho, Ajalmar Rêgo da Rocha Neto, Thiago Alves Rocha

TL;DR

Onestep builds upon prior work by generating explanations in a single step for each feature and each bound, eliminating the overhead of an iterative process. Twostep takes a gradual approach that significantly increases explanation coverage.

Abstract

Machine learning models support decision-making, yet the reasons behind their predictions are opaque. Clear and reliable explanations help users make informed decisions and avoid blindly trusting model outputs. However, many existing explanation methods fail to guarantee correctness. Logic-based approaches ensure correctness but often offer overly constrained explanations, limiting coverage. Recent work addresses this by incrementally expanding explanations while maintaining correctness. This process is performed separately for each feature, adjusting both its upper and lower bounds. However, this approach faces a trade-off: smaller increments incur high computational costs, whereas larger ones may lead to explanations covering fewer instances. To overcome this, we propose two novel methods. Onestep builds upon this prior work, generating explanations in a single step for each feature and each bound, eliminating the overhead of an iterative process. Twostep takes a gradual approach, improving coverage. Experimental results show that Twostep significantly increases explanation coverage (by up to 72.60% on average across datasets) compared to Onestep and, consequently, to prior work.
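The increment trade-off described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it assumes a toy linear classifier (score `w·x + b`, positive class when the score is above zero), for which the correctness of a box of feature ranges can be checked in closed form instead of with a solver. The names `min_score` and `expand_incrementally`, the weights, the instance, and the feature domains are all hypothetical, chosen only for illustration.

```python
from typing import List, Tuple


def min_score(w: List[float], b: float,
              lower: List[float], upper: List[float]) -> float:
    """Exact minimum of w.x + b over the box [lower, upper].

    For a linear score the minimum is reached at a corner: the lower
    bound where the weight is non-negative, the upper bound otherwise.
    """
    return b + sum(wi * (lo if wi >= 0 else hi)
                   for wi, lo, hi in zip(w, lower, upper))


def expand_incrementally(w: List[float], b: float, x0: List[float],
                         dom_lo: List[float], dom_hi: List[float],
                         step: float) -> Tuple[List[float], List[float], int]:
    """Widen each feature's range around x0 by fixed increments.

    An increment is kept only if every point of the widened box still
    gets the positive class. Returns the final bounds and the number of
    correctness checks performed (a stand-in for solver calls).
    """
    lower, upper = list(x0), list(x0)
    checks = 0
    for i in range(len(x0)):
        # Grow the upper bound of feature i until a check fails.
        while upper[i] < dom_hi[i]:
            candidate = min(upper[i] + step, dom_hi[i])
            trial = upper[:i] + [candidate] + upper[i + 1:]
            checks += 1
            if min_score(w, b, lower, trial) <= 0:
                break
            upper[i] = candidate
        # Grow the lower bound of feature i until a check fails.
        while lower[i] > dom_lo[i]:
            candidate = max(lower[i] - step, dom_lo[i])
            trial = lower[:i] + [candidate] + lower[i + 1:]
            checks += 1
            if min_score(w, b, trial, upper) <= 0:
                break
            lower[i] = candidate
    return lower, upper, checks


# Hypothetical toy setup: two features in [0, 1], instance classified positive.
w, b = [1.0, -1.0], 0.0
x0 = [0.625, 0.25]
dom_lo, dom_hi = [0.0, 0.0], [1.0, 1.0]

coarse = expand_incrementally(w, b, x0, dom_lo, dom_hi, step=0.5)
fine = expand_incrementally(w, b, x0, dom_lo, dom_hi, step=0.125)
print("coarse step:", coarse)
print("fine step:  ", fine)
```

With these toy values, the coarse step needs 4 checks but leaves the first feature's lower bound at 0.625, while the fine step needs 9 checks and pushes it down to 0.375. The wider range covers more instances at a higher checking cost, which is the trade-off that motivates computing each bound directly rather than iteratively.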

Paper Structure (14 sections, 34 equations, 11 figures, 7 tables, 3 algorithms)

Figures (11)

  • Figure 1: Example of how the range of a feature $f_i$ is computed with an incremental approach.
  • Figure 2: Example of how the range of a feature $f_i$ is computed with Onestep.
  • Figure 3: Illustration of how a partial range for $f_i$ is computed with Twostep.
  • Figure 4: Distribution of dataset coverage improvement (%) achieved by Twostep over Onestep for SVM, with the Twostep parameter fixed at $p=0.25$. Red bars represent cases with worsened coverage, gray bars represent cases with the same coverage, and green bars represent cases with improved coverage. The best improvement and the worst deterioration are highlighted.
  • Figure 5: Distribution of dataset coverage improvement (%) achieved by Twostep over Onestep for MLP, with the Twostep parameter fixed at $p=0.25$. Red bars represent cases with worsened coverage, gray bars represent cases with the same coverage, and green bars represent cases with improved coverage. The best improvement and the worst deterioration are highlighted.
  • ...and 6 more figures