Accelerated Aggregated D-Optimal Designs for Estimating Main Effects in Black-Box Models

Chih-Yu Chang, Ming-Chung Chang

TL;DR

This work tackles the challenge of robust, model-agnostic estimation of main effects for black-box predictors, especially under feature correlation. It introduces A2D2E, an accelerated aggregated D-Optimal Designs estimator that preserves ALE localization while using D-optimal design to estimate local slopes, yielding improved variance properties and consistency without requiring differentiability. Theoretical results establish variance reduction and consistency, and extensive simulations plus real-data and LLM-based case studies demonstrate that A2D2E outperforms PD and ALE, particularly in correlated settings. The approach offers practical, scalable interpretability for modern ML applications, including neural networks, Gaussian processes, and language-model surrogates, with broad applicability to real-world decision making.

Abstract

Recent advances in supervised learning have driven growing interest in explaining black-box models, particularly by estimating the effects of input variables on model predictions. However, existing approaches often face key limitations, including poor scalability, sensitivity to out-of-distribution sampling, and instability under correlated features. To address these issues, we propose A2D2E, an $\textbf{E}$stimator based on $\textbf{A}$ccelerated $\textbf{A}$ggregated $\textbf{D}$-Optimal $\textbf{D}$esigns. Our method leverages principled experimental design to improve efficiency and robustness in main effect estimation. We establish theoretical guarantees, including convergence and variance reduction, and validate A2D2E through extensive simulations. We further demonstrate the potential of the proposed method with a case study on real data and applications in language models. The code to reproduce the results can be found at https://github.com/cchihyu/A2D2E.
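The role of experimental design in slope estimation can be illustrated with a textbook fact, sketched here in Python. This is not the authors' A2D2E algorithm (see the linked repository for that); it only shows, under the assumption of a simple linear model $y = \beta_0 + \beta_1 x + \varepsilon$ on a single interval with homoscedastic noise, that placing design points at the interval endpoints maximizes the determinant of the information matrix and lowers the variance of the slope estimate compared with evenly spread points. The function names, the interval $[0, 1]$, and the sample size are hypothetical choices made for illustration.

```python
import numpy as np


def information_matrix(points):
    """Normalized information matrix X^T X / n for y = b0 + b1 * x."""
    x = np.asarray(points, dtype=float)
    X = np.column_stack([np.ones_like(x), x])  # intercept and slope columns
    return X.T @ X / len(x)


def slope_variance_factor(points):
    """Entry of M^{-1} for the slope: n * Var(b1_hat) / sigma^2."""
    return np.linalg.inv(information_matrix(points))[1, 1]


# Hypothetical interval (think of one bin) and number of model evaluations.
lo, hi, n = 0.0, 1.0, 10

# Candidate 1: half of the points at each endpoint of the interval.
endpoint_design = np.array([lo] * (n // 2) + [hi] * (n // 2))
# Candidate 2: points spread evenly across the interval.
uniform_design = np.linspace(lo, hi, n)

det_endpoint = np.linalg.det(information_matrix(endpoint_design))
det_uniform = np.linalg.det(information_matrix(uniform_design))

print("det(M), endpoint design:", det_endpoint)
print("det(M), uniform design: ", det_uniform)
print("slope variance factor, endpoint:", slope_variance_factor(endpoint_design))
print("slope variance factor, uniform: ", slope_variance_factor(uniform_design))
```

For this toy linear model the endpoint design gives the larger determinant and the smaller slope variance factor, which is the sense in which a D-optimal design is efficient for estimating a local slope.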

Paper Structure

This paper contains 31 sections, 3 theorems, 32 equations, 4 figures, 3 tables, 1 algorithm.

Key Result

Lemma 1

(Proved in Appendix appA1) Suppose that Assumption assump holds. Then its variance is given by

Figures (4)

  • Figure 1: Comparison between ALE and A2D2E. The contour represents the prediction model. The x-axis shows the variable of interest, the black vertical line indicates the location at which the main effect is estimated, and the blue vertical lines denote the bin boundaries (7 bins in total).
  • Figure 2: Estimated main-effect function under the simple-1 setting with low dependence level, comparing the proposed A2D2E with ALE. The red curves are the true main-effect function computed by (\ref{eq:truth}).
  • Figure 3: Estimated main-effect functions for the variables year, acceleration, horsepower, and weight using PD, ALE, and the proposed A2D2E algorithms.
  • Figure 4: Estimated main effects on the log-odds of classifying a sample as versicolor for the variables petal length, petal width, sepal length, and sepal width, using PD, ALE, and the proposed A2D2E algorithms.

Theorems & Definitions (5)

  • Lemma 1
  • Lemma 2
  • Theorem 1
  • proof
  • proof