Improving Local Fidelity Through Sampling and Modeling Nonlinearity

Sanjeev Shrestha, Rahul Dubey, Hui Liu

TL;DR

This paper addresses the fidelity gap in local explanations by replacing the linear surrogate in LIME with a nonlinear MARS surrogate trained on points drawn locally via N-ball sampling. The method, mLIME, captures nonlinear local decision boundaries and avoids reweighting by sampling directly from the local distribution. Empirical results on three UCI datasets across multiple classifiers show substantial RMSE reductions versus LIME and LEMON, indicating higher faithfulness and stability. The approach makes local explanations more reliable in high-stakes settings and could be extended to other data modalities.

Abstract

With the increasing complexity of black-box machine learning models and their adoption in high-stakes areas, it is critical to provide explanations for their predictions. Local Interpretable Model-agnostic Explanations (LIME) is a widely used technique that explains the prediction of any classifier by learning an interpretable model locally around the instance being explained. However, it assumes that the local decision boundary is linear and so fails to capture non-linear relationships, leading to incorrect explanations. In this paper, we propose a novel method that generates high-fidelity explanations. Multivariate adaptive regression splines (MARS) are used to model non-linear local boundaries, effectively capturing the underlying behavior of the reference model and thereby enhancing the local fidelity of the explanation. Additionally, we use the N-ball sampling technique, which samples directly from the desired distribution instead of reweighting samples as LIME does, further improving the faithfulness score. We evaluate our method on three UCI datasets across different classifiers and varying kernel widths. Experimental results show that our method yields more faithful explanations than the baselines, achieving an average reduction of 37% in root mean square error and significantly improving local fidelity.

Paper Structure

This paper contains 8 sections, 6 equations, 1 figure, 1 table.

Figures (1)

  • Figure 1: The workflow of the framework consists of four steps: (1) select a target instance to be explained, (2) generate synthetic samples within its neighbourhood using N-ball sampling, (3) assign labels to these samples using the reference model, and (4) train a MARS surrogate model to approximate the non-linear local decision boundary. Unlike the sampling method in LIME, which samples across the entire feature space and reweights the samples by their proximity to the instance, N-ball sampling draws directly from the desired local distribution within a specified radius. Because this yields a denser set of local samples, the surrogate model captures the reference model's behaviour more faithfully.
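
Steps (1) to (3) of the workflow in Figure 1 can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: it assumes that sampling is uniform inside the ball, and the names `sample_n_ball` and `reference_model`, the radius, and the toy scoring function are all hypothetical. Step (4), fitting the MARS surrogate, is left out because it depends on a spline-regression library.

```python
import math
import random


def sample_n_ball(center, radius, n_samples, rng=None):
    """Draw points uniformly from the ball of `radius` around `center`.

    Assumption: uniform density inside the ball; the paper's exact
    sampling distribution may differ.
    """
    rng = rng or random.Random()
    dim = len(center)
    samples = []
    for _ in range(n_samples):
        # A standard Gaussian vector points in a uniformly random direction.
        direction = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        norm = math.sqrt(sum(x * x for x in direction)) or 1.0
        # Scaling by u ** (1 / dim) makes the points uniform in volume
        # rather than clustered near the centre.
        r = radius * rng.random() ** (1.0 / dim)
        samples.append([c + r * x / norm for c, x in zip(center, direction)])
    return samples


def reference_model(x):
    """Hypothetical black-box model: a non-linear score in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-(x[0] ** 2 - x[1])))


# (1) target instance, (2) local samples, (3) labels from the reference model.
instance = [0.5, -0.2]
neighbours = sample_n_ball(instance, radius=0.3, n_samples=500,
                           rng=random.Random(0))
labels = [reference_model(p) for p in neighbours]
```

Every sample falls inside the chosen radius, so no proximity weights are needed before a surrogate is fitted to `neighbours` and `labels`.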