A Voting Approach for Explainable Classification with Rule Learning

Albert Nössig, Tobias Hell, Georg Moser

TL;DR

This paper introduces a voting approach that combines rule learning with (unexplainable) state-of-the-art methods, aiming to achieve results comparable to the latter while still providing explanations in the form of deterministic rules.

Abstract

State-of-the-art results in typical classification tasks are mostly achieved by unexplainable machine learning methods such as deep neural networks. In contrast, this paper investigates the application of rule learning methods in such a context, so that classifications are based on comprehensible (first-order) rules that explain the predictions made. In general, however, rule-based classifications are less accurate than state-of-the-art results, often significantly so. As our main contribution, we introduce a voting approach combining both worlds, aiming to achieve results comparable to those of (unexplainable) state-of-the-art methods while still providing explanations in the form of deterministic rules. Considering a variety of benchmark data sets, including a use case of significant interest to the insurance industry, we show that our approach not only clearly outperforms ordinary rule learning methods but also yields results on a par with state-of-the-art outcomes.

Paper Structure (27 sections, 10 figures, 9 tables, 1 algorithm)

Figures (10)

  • Figure 1: Motivational example: procedure of the proposed voting approach, illustrated on the MNIST digits, distinguishing between the two basic scenarios, namely coinciding and conflicting predictions of the rule learners. In a first step, only the explainable methods are considered, and their prediction is used if they agree. Otherwise, an (unexplainable) state-of-the-art method, the so-called decider, is consulted to resolve the rule conflict.
  • Figure 2: Visualisation of a decision tree constructed for the nominal weather data set.
  • Figure 3: Modular approach to rule learning. The first phase (Representation Learning) yields a compact representation of the original (high-dimensional) input data, which benefits the clustering applied in the second phase (Input Selection). Placing these two steps before the application of a chosen Rule Learner in the final phase makes it possible to find comprehensible rules on very large data sets in reasonable time.
  • Figure 4: Illustration of a learned rule in the context of classifying whether a given digit is equal to zero or not. The red crosses correspond to an expected positive pixel intensity (black_n(V)), while the blue diamonds in the middle demand zero pixel intensity (white_n(V)).
  • Figure 5: Illustration of the accuracies shown in Table \ref{accuracies}.
  • ...and 5 more figures
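
The two scenarios described in the caption of Figure 1 can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions, not the authors' implementation: the function name `vote`, the toy rule learners, and the toy decider are all hypothetical. The sketch also assumes the simplest reading of the caption, namely that the decider's prediction is returned directly when the rule learners disagree. How the final prediction is tied back to a specific rule is not specified in this summary.

```python
from typing import Callable, Hashable, Sequence, Tuple

# A predictor maps an input to a class label (hypothetical type alias).
Predictor = Callable[[object], Hashable]


def vote(
    x: object,
    rule_learners: Sequence[Predictor],
    decider: Predictor,
) -> Tuple[Hashable, str]:
    """Return (label, source), where source is "rules" or "decider"."""
    if not rule_learners:
        raise ValueError("at least one rule learner is required")

    # Step 1: consult only the explainable rule learners.
    predictions = [learner(x) for learner in rule_learners]

    # Coinciding predictions: use the agreed label, backed by rules.
    if all(p == predictions[0] for p in predictions):
        return predictions[0], "rules"

    # Conflicting predictions: consult the (unexplainable) decider.
    return decider(x), "decider"


# Toy stand-ins (assumptions for illustration only).
def learner_a(x: int) -> str:
    return "even" if x % 2 == 0 else "odd"


def learner_b(x: int) -> str:
    # Deliberately imperfect rule set: only knows a few even numbers.
    return "even" if x in (0, 2, 4) else "odd"


def decider(x: int) -> str:
    return "even" if x % 2 == 0 else "odd"


print(vote(2, [learner_a, learner_b], decider))
print(vote(6, [learner_a, learner_b], decider))
```

For the input 2 both toy learners agree, so the label comes from the rules alone. For the input 6 they disagree, so the decider is consulted.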