Example-based Explanations for Random Forests using Machine Unlearning

Tanmay Surve; Romila Pradhan

TL;DR

FairDebugger addresses the problem of understanding and debugging fairness violations in tree-based models by identifying training-data subsets that causally contribute to bias. It combines DaRE-RF-based machine unlearning to efficiently estimate subset contributions with an Apriori lattice to prune the subset search space and generate top-$k$ coherent explanations. Across German Credit, Adult Income, and Stop-Frisk datasets, the approach yields interpretable explanations that align with known biases and achieve substantial parity reductions with minimal accuracy loss. This data-centric, example-based debugging framework enables targeted remediation and scalable fairness auditing for random forests and can be extended to other tree-based or black-box models.

Abstract

Tree-based machine learning models, such as decision trees and random forests, have been hugely successful in classification tasks, primarily because of their predictive power and ease of interpretation. Despite their popularity and power, these models have been found to produce unexpected or discriminatory outcomes. Given their success on most tasks, it is of interest to identify the sources of this unexpected and discriminatory behavior. However, little work exists on understanding and debugging tree-based classifiers in the context of fairness. We introduce FairDebugger, a system that uses recent advances in machine unlearning research to identify training data subsets responsible for fairness violations in the outcomes of a random forest classifier. FairDebugger generates top-$k$ explanations (in the form of coherent training data subsets) for model unfairness. Toward this goal, FairDebugger first uses machine unlearning to estimate the change in the tree structures of the random forest when parts of the underlying training data are removed, and then leverages the Apriori algorithm from frequent itemset mining to reduce the subset search space. We empirically evaluate our approach on three real-world datasets and demonstrate that the explanations generated by FairDebugger are consistent with insights from prior studies on these datasets.
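To make the notion of a subset's contribution toward bias concrete, the sketch below measures a fairness gap and the relative reduction in that gap after a subset is removed. It assumes one common formalization of predictive parity (the difference in precision between two groups); the exact metric definitions used by FairDebugger are not reproduced in this summary, and the function names `precision`, `predictive_parity_gap`, and `bias_reduction` are hypothetical.

```python
from typing import Sequence


def precision(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of positive predictions whose true label is positive."""
    hits = [t for t, p in zip(y_true, y_pred) if p == 1]
    return sum(hits) / len(hits) if hits else 0.0


def predictive_parity_gap(y_true, y_pred, protected) -> float:
    """Precision on the privileged group (protected == 0) minus precision
    on the protected group (protected == 1). Assumed formalization."""
    def group(g):
        rows = [(t, p) for t, p, s in zip(y_true, y_pred, protected) if s == g]
        return [t for t, _ in rows], [p for _, p in rows]

    return precision(*group(0)) - precision(*group(1))


def bias_reduction(gap_before: float, gap_after: float) -> float:
    """Relative reduction in the absolute gap once a subset is removed."""
    if gap_before == 0:
        return 0.0
    return (abs(gap_before) - abs(gap_after)) / abs(gap_before)


# Toy labels and predictions (illustrative values, not from the paper).
y_true = [1, 1, 0, 1, 1, 0, 0, 1]
y_pred = [1, 1, 1, 1, 1, 1, 0, 1]
protected = [0, 0, 0, 0, 1, 1, 1, 1]

gap = predictive_parity_gap(y_true, y_pred, protected)
reduction = bias_reduction(0.2, 0.05)
```

In this sketch the "after" gap would come from a model updated without the candidate subset; per the abstract, FairDebugger obtains that updated model through machine unlearning rather than by retraining from scratch.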

Paper Structure (16 sections, 5 equations, 4 figures, 11 tables, 1 algorithm)

Introduction
Problem Definition
Preliminaries
Explanation Generation
Estimating subset contribution toward bias
Pruning the subset search space
Putting it all together
Experimental Evaluation
Experimental Setup
Datasets
Metrics
Effectiveness of machine unlearning for debugging fairness
Effectiveness of FairDebugger's explanations
Efficiency of FairDebugger
Related Work
...and 1 more sections

Figures (4)

Figure 1: An overview of FairDebugger. (a) Given a random forest classifier trained on some training data, the classifier generates biased predictions on some test data. (b) FairDebugger uses machine unlearning and subset search space pruning techniques to determine the (c) top-$k$ training data subsets responsible for the biased predictions, along with the improvement in fairness of the updated model trained after deleting the subset from the training data.
Figure 2: Visualization of lattice structure for subset generation. We show the first three levels of an example lattice. At level 1, all nodes consist of a single literal. For example, Gender='Male' indicates data instances where the column Gender has the value 'Male'. At each level, literals are merged two at a time, as illustrated, to generate subsequent subsets.
Figure 3: Effect of DaRE-RF's unlearning capability on fairness for 1000 random subsets and 1000 coherent subsets. The top three plots correspond to random subsets for three subset support ranges, while the bottom three plots correspond to coherent subsets. All plots use the predictive parity fairness metric. The x-axis indicates model fairness after DaRE-RF's unlearning was used to remove the influence of a subset from the model. The y-axis indicates the fairness of a model retrained after removing that subset from the training data. The green line is the $y = x$ line.
Figure 4: Runtime of FairDebugger for various dataset dimensions. Dimension refers to num_of_data_instances $\times$ num_of_attributes. We observe that the runtime increases almost quadratically as dataset dimension increases.
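The lattice described in the Figure 2 caption can be sketched as follows. This is a minimal illustration of Apriori-style level-wise generation, assuming a subset is a conjunction of `column = value` literals and that candidates below a minimum support are pruned; the helper names, the `Housing` column, the toy rows, and the threshold are all hypothetical and not taken from the paper.

```python
from itertools import combinations


def matches(row, pattern):
    """True if the row satisfies every (column, value) literal in the pattern."""
    return all(row.get(col) == val for col, val in pattern)


def support(rows, pattern):
    """Fraction of rows covered by the pattern."""
    return sum(matches(r, pattern) for r in rows) / len(rows)


def level_one(rows, min_support):
    """Level 1 of the lattice: single literals with enough support."""
    literals = {(col, val) for r in rows for col, val in r.items()}
    return [frozenset([lit]) for lit in sorted(literals)
            if support(rows, [lit]) >= min_support]


def next_level(rows, patterns, min_support):
    """Merge patterns two at a time to build the next lattice level."""
    if not patterns:
        return []
    size = len(patterns[0]) + 1
    merged = set()
    for a, b in combinations(patterns, 2):
        cand = a | b
        if len(cand) != size:
            continue  # the pair must differ in exactly one literal
        cols = [col for col, _ in cand]
        if len(cols) != len(set(cols)):
            continue  # two values for one column can never match a row
        if support(rows, cand) >= min_support:
            merged.add(cand)
    return sorted(merged, key=sorted)


# Toy data (illustrative only).
rows = [
    {"Gender": "Male", "Housing": "Own"},
    {"Gender": "Male", "Housing": "Rent"},
    {"Gender": "Female", "Housing": "Own"},
    {"Gender": "Male", "Housing": "Own"},
]
level1 = level_one(rows, min_support=0.5)
level2 = next_level(rows, level1, min_support=0.5)
```

Because support can only shrink as literals are added, a literal pruned at level 1 never needs to be revisited at deeper levels, which is what keeps the search space manageable.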

Theorems & Definitions (5)

Example 1: German Credit Dataset
Definition 1
Definition 2
Definition 3
Definition 4
