Statistical inference using machine learning and classical techniques based on accumulated local effects (ALE)

Chitu Okoli

TL;DR

This work advances ALE as a robust, model-agnostic framework for global ML explanations. It addresses reliability on small datasets, introduces interpretable ALE-based effect size measures (ALER, ALED, NALER, NALED), and establishes bootstrapped confidence regions for inference. It also prescribes full-model bootstrapping to mitigate overfitting in small samples and demonstrates these methods on a large dataset (diamonds) and a small dataset (math achievement), with implementations in the 'ale' package for R. The contributions support reliable, nuanced conclusions about predictor effects across the entire domain of each predictor, balancing effect size summaries with domain-specific confidence regions that reveal heterogeneous patterns. In practice, researchers and practitioners gain scalable, interpretable tools for statistical inference in ML contexts, including clear guidance on when effects are practically meaningful beyond statistical significance. Together, these advances improve the interpretability and trustworthiness of model explanations in applied settings, particularly where data are limited or effects are highly non-linear.

Abstract

Accumulated Local Effects (ALE) is a model-agnostic approach for global explanations of the results of black-box machine learning (ML) algorithms. There are at least three challenges with conducting statistical inference based on ALE: ensuring the reliability of ALE analyses, especially in the context of small datasets; intuitively characterizing a variable's overall effect in ML; and making robust inferences from ML data analysis. In response, we introduce innovative tools and techniques for statistical inference using ALE, establishing bootstrapped confidence intervals tailored to dataset size and introducing ALE effect size measures that intuitively indicate effects on both the outcome variable scale and a normalized scale. Furthermore, we demonstrate how to use these tools to draw reliable statistical inferences, reflecting the flexible patterns ALE adeptly highlights, with implementations available in the 'ale' package in R. This work propels the discourse on ALE and its applicability in ML and statistical analysis forward, offering practical solutions to prevailing challenges in the field.

Paper Structure (36 sections, 5 equations, 12 figures)

Introduction
Related work
Software implementations of ALE
ALE confidence intervals and bootstrapping
Effect size measures for machine learning
Inference from analysis results
Opportunities for improvement
Illustrative datasets and models
Large dataset: random forest model for diamond prices
Small dataset: generalized additive model for mathematics achievement scores
Bootstrapping of accumulated local effects
Data-only bootstrapping of a large dataset (diamonds)
Model bootstrapping of a small dataset (math achievement)
Inappropriate data-only bootstrapping of a small dataset
Appropriate full-model bootstrapping of a small dataset
...and 21 more sections

Figures (12)

Figure 1: Simple ALE plots for random forest model of diamond prices
Figure 2: Zoom-in of ALE plot for x_length for diamond prices
Figure 3: Zoom-in of ALE plot for rand_norm for diamond prices
Figure 4: Bootstrapped ALE plots for random forest model of diamond prices
Figure 5: Simple ALE plots for GAM of mathematics achievement scores
...and 7 more figures
