MoRE-LLM: Mixture of Rule Experts Guided by a Large Language Model

Alexander Koebler, Ingo Thon, Florian Buettner

TL;DR

MoRE-LLM targets trustworthy, interpretable predictions by coupling a small task-specific model with a learned rule set through a gating mechanism, and by refining these rules with an LLM during training. The approach uses Anchors to generate local rule surrogates and a constrained optimization framework based on Dynamic Barrier Gradient Descent to keep predictive performance close to that of an unconstrained model while increasing rule usage. The LLM refines the rule set in two ways, adapting rules to correct them and pruning rules for alignment, so that domain knowledge is embedded in the rule set without requiring LLM access at deployment. Experiments on tabular datasets indicate that MoRE-LLM can reach accuracy competitive with non-interpretable baselines while providing more domain-aligned, high-fidelity rule explanations than purely white-box methods.

Abstract

To ensure the trustworthiness and interpretability of AI systems, it is essential to align machine learning models with human domain knowledge. This can be a challenging and time-consuming endeavor that requires close communication between data scientists and domain experts. Recent leaps in the capabilities of Large Language Models (LLMs) can help alleviate this burden. In this paper, we propose a Mixture of Rule Experts guided by a Large Language Model (MoRE-LLM) which combines a data-driven black-box model with knowledge extracted from an LLM to enable domain knowledge-aligned and transparent predictions. While the introduced Mixture of Rule Experts (MoRE) steers the discovery of local rule-based surrogates during training and their utilization for the classification task, the LLM is responsible for enhancing the domain knowledge alignment of the rules by correcting and contextualizing them. Importantly, our method does not rely on access to the LLM during test time and ensures interpretability while not being prone to LLM-based confabulations. We evaluate our method on several tabular data sets and compare its performance with interpretable and non-interpretable baselines. Besides performance, we evaluate our grey-box method with respect to the utilization of interpretable rules. In addition to our quantitative evaluation, we shed light on how the LLM can provide additional context to strengthen the comprehensibility and trustworthiness of the model's reasoning process.

Paper Structure

This paper contains 13 sections, 3 equations, 4 figures, 2 tables, 1 algorithm.

Figures (4)

  • Figure 1: In MoRE-LLM, the LLM is utilized in two steps of the model’s life-cycle. During training, it aligns discovered rules with domain knowledge, while during testing, insights generated by the LLM augment the model’s interpretability.
  • Figure 2: Overall MoRE-LLM architecture. The elements inside the blue box (the gating model $g$, the black-box classifier $f$, and the rule-based classifier $r$ with its rule set $\mathcal{R}$) are required at test time. The training set $\mathcal{D}$, the large language model $Q$, and the explainer module $E$ in the red box are needed only during training.
  • Figure 3: Examples of LLM-based rule refinement. The rule adaptation example on the diabetes dataset (top) relies on specific knowledge about health factors, while the rule pruning example on the adult dataset (bottom) uncovers contradictions with the other rules.
  • Figure 4: Rule coverage and utilization (a) as well as test accuracy and accuracy of the generated rules (b) for MoRE-LLM (MLP) on a test set for the diabetes classification task across five consecutive steps.
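
The test-time path described in Figure 2 (gating model $g$, black-box classifier $f$, rule-based classifier $r$ over the rule set $\mathcal{R}$) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the `Rule` format (a conjunction of per-feature intervals, in the spirit of Anchors), the hard 0.5 threshold on the gate, the first-match rule lookup, and the toy `gate`, `black_box`, and `rules` objects are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple

import numpy as np


@dataclass
class Rule:
    """Hypothetical rule format: a conjunction of feature intervals and a class label."""
    bounds: Dict[int, Tuple[float, float]]  # feature index -> (low, high), low <= x < high
    label: int

    def covers(self, x: np.ndarray) -> bool:
        # A rule applies only if every interval predicate holds for the sample.
        return all(lo <= x[i] < hi for i, (lo, hi) in self.bounds.items())


def first_covering_rule(rules: List[Rule], x: np.ndarray) -> Optional[Rule]:
    """Assumed rule classifier r: return the first rule in R that covers x, if any."""
    for rule in rules:
        if rule.covers(x):
            return rule
    return None


def predict(
    x: np.ndarray,
    gate: Callable[[np.ndarray], float],
    black_box: Callable[[np.ndarray], int],
    rules: List[Rule],
    threshold: float = 0.5,  # assumed hard routing threshold
) -> Tuple[int, Optional[Rule]]:
    """Route x to the rule expert when the gate prefers it and a rule applies,
    otherwise fall back to the black-box model. No LLM is involved here."""
    rule = first_covering_rule(rules, x)
    if rule is not None and gate(x) >= threshold:
        return rule.label, rule  # interpretable prediction plus the rule that produced it
    return black_box(x), None    # non-interpretable fallback


# Toy stand-ins (made up for this sketch, not taken from the paper).
rules = [Rule(bounds={0: (140.0, float("inf"))}, label=1)]
gate = lambda x: 0.9 if x[0] >= 140.0 else 0.1
black_box = lambda x: int(x[1] > 30.0)

label_a, rule_a = predict(np.array([150.0, 25.0]), gate, black_box, rules)
label_b, rule_b = predict(np.array([100.0, 35.0]), gate, black_box, rules)
```

Returning the matched rule alongside the label is one way to expose which predictions were rule-based, which is the quantity the rule utilization curves in Figure 4 track.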