Rule by Rule: Learning with Confidence through Vocabulary Expansion

Albert Nössig, Tobias Hell, Georg Moser

TL;DR

This paper proposes an innovative iterative approach to rule learning, specifically designed for text-based data, that focuses on progressively expanding the vocabulary used in each iteration, resulting in a significant reduction in memory consumption.

Abstract

In this paper, we present an innovative iterative approach to rule learning specifically designed for (but not limited to) text-based data. Our method progressively expands the vocabulary used in each iteration, which significantly reduces memory consumption. Moreover, we introduce a Value of Confidence as an indicator of the reliability of the generated rules. By leveraging the Value of Confidence, our approach ensures that only the most robust and trustworthy rules are retained, thereby improving the overall quality of the rule learning process. We demonstrate the effectiveness of our method through extensive experiments on various textual as well as non-textual datasets, including a use case of significant interest to the insurance industry, showcasing its potential for real-world applications.

Paper Structure

This paper contains 22 sections, 1 equation, 8 figures, 4 tables, 1 algorithm.

Figures (8)

  • Figure 1: Modular Approach to Rule Learning. The first phase (Representation Learning) is intended to yield a compact representation of the original (high-dimensional) input data. This is advantageous for clustering applied subsequently during the second phase (Input Selection). These two steps put in front of the application of a chosen Rule Learner in the final phase make it possible to find comprehensible rules on very large data sets in reasonable time.
  • Figure 2: Voting Approach for end-to-end Explainable Classification. Generally the approach distinguishes between two basic scenarios, namely coinciding predictions given by the rule learners as well as conflicting ones. In a first step only the explainable methods are considered using the corresponding prediction in case they match. Otherwise, an (unexplainable) state-of-the-art method -- the so-called decider -- is consulted to resolve the existing rule conflict.
  • Figure 3: Illustration of Accuracies shown in Table \ref{tab:benchmark_results_text}.
  • Figure 4: Illustration of Memory Consumptions shown in Table \ref{tab:benchmark_results_text}. Note that the memory consumption illustrated for $\mathsf{RIPPER}$ applied on the IMDB data set corresponds to a reduced feature space compared to the application of $\mathsf{FOIL}$.
  • Figure 5: Illustration of Accuracies shown in Table \ref{tab:benchmark_results_nominal}.
  • ...and 3 more figures
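
The voting scheme described in the caption of Figure 2 can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function name `vote`, the modelling of classifiers as plain callables, and the returned `explainable` flag are assumptions made for illustration only. The sketch covers just the two scenarios named in the caption, namely coinciding and conflicting predictions of the rule learners.

```python
from typing import Callable, Hashable, Sequence, Tuple

# Hypothetical type: any callable mapping an input sample to a class label.
Predictor = Callable[[str], Hashable]


def vote(
    sample: str,
    rule_learners: Sequence[Predictor],
    decider: Predictor,
) -> Tuple[Hashable, bool]:
    """Return (label, explainable) for one sample.

    If all rule learners agree, their shared prediction is used and the
    result counts as explainable. Otherwise the (unexplainable) decider
    resolves the conflict.
    """
    if not rule_learners:
        raise ValueError("at least one rule learner is required")

    predictions = [learner(sample) for learner in rule_learners]

    # Coinciding predictions: keep the explainable result.
    if len(set(predictions)) == 1:
        return predictions[0], True

    # Conflicting predictions: consult the decider.
    return decider(sample), False
```

In this sketch the decider is only called when the rule learners disagree, so every prediction flagged as explainable is backed by rules from all learners.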

Theorems & Definitions (1)

  • Definition 1