Sensitivity analysis from a single input/output sample

Sébastien Da Veiga, Fabrice Gamboa, Thierry Klein, Agnès Lagnoux, Clémentine Prieur

TL;DR

This work addresses how to estimate closed Sobol' indices $S^X=\frac{\text{Var}(\text{E}[Y|X])}{\text{Var}(Y)}$ from a single i.i.d. input/output sample, without restrictive input independence. It introduces two mirror-type, high-order kernel regression estimators for the regression function $m(x)=\text{E}[Y|X=x]$, derives efficient influence-function-based estimators for $T=\text{E}[\text{E}[Y|X]^2]$, and proves $\sqrt{n}$-consistency and asymptotic efficiency via central limit theorems. The paper compares these estimators to PF, NN, and other kernel approaches, showing asymptotic efficiency and favorable finite-sample performance on standard test functions and a flood model. The resulting methodology provides a practical, theoretically sound tool for global sensitivity analysis when only a single $n$-sample is available.
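To make these quantities concrete, here is a minimal Python sketch of a kernel-based estimate of $S^X=\text{Var}(\text{E}[Y|X])/\text{Var}(Y)$. It is not the paper's estimator. It assumes a plain leave-one-out Nadaraya-Watson regression with a Gaussian kernel and a hand-picked bandwidth `h`, with no mirror-type transformation and no high-order kernel. It combines this with the correction $\frac{1}{n}\sum_i \big(2Y_i\hat m(X_i)-\hat m(X_i)^2\big)$ for $T$, a standard one-step form in semi-parametric estimation. The function names `nw_loo` and `closed_sobol` are hypothetical.

```python
import numpy as np


def nw_loo(x, y, h):
    """Leave-one-out Nadaraya-Watson estimate of m(x_i) = E[Y | X = x_i].

    Assumption: Gaussian kernel with a single bandwidth h for all coordinates.
    """
    y = np.asarray(y, dtype=float)
    x = np.asarray(x, dtype=float).reshape(len(y), -1)
    # Pairwise squared Euclidean distances between sample points.
    d2 = ((x[:, None, :] - x[None, :, :]) ** 2).sum(axis=-1)
    w = np.exp(-0.5 * d2 / h**2)
    # Leave-one-out: a point does not contribute to its own prediction.
    np.fill_diagonal(w, 0.0)
    return (w @ y) / w.sum(axis=1)


def closed_sobol(x, y, h):
    """Illustrative estimate of Var(E[Y|X]) / Var(Y) from one sample."""
    y = np.asarray(y, dtype=float)
    m_hat = nw_loo(x, y, h)
    # One-step (bias-corrected) estimate of T = E[ E[Y|X]^2 ].
    t_hat = np.mean(2.0 * y * m_hat - m_hat**2)
    return (t_hat - np.mean(y) ** 2) / np.var(y)
```

As a sanity check, take $Y = 2X_1 + X_2$ with independent inputs uniform on $[0,1]$. The closed index of $X_1$ is $(4/12)/(5/12) = 0.8$, and `closed_sobol(x1, y, h=0.05)` should land close to that value for a sample of size around 1000.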

Abstract

The main objective of this paper is to estimate optimally Sobol' indices at any order when a unique input/output i.i.d. sample is available. Our approach stands on three main ingredients: semi-parametric estimation theory, high-order kernel estimation (inspired by the paper of Doksum in 1995), and mirror-type transformations as introduced in Bertin 2020 and Pujol 2022. We propose two different estimators. We prove that these estimators are asymptotically normal and efficient. Furthermore, we illustrate their numerical properties on standard examples.

Paper Structure (27 sections, 11 theorems, 87 equations, 10 figures)

Key Result

Lemma 3.1

Under Assumptions hyp:domain, hyp:lipschitz, and hyp:kernel, for all $i\in \{1,\cdots,d\}$,

Figures (10)

  • Figure 1: Mirror-type transformation defined in \ref{def:mirror} with $d=2$, for $x=(1/3,3/4)$, and for $y=(2/3,1/5)$.
  • Figure 2: Mirror-image transformation with $d=2$ in red. On the left-hand side of the figure, a data point in $[0,1]^d$ (in green) and its 8 mirror images (in orange). A kernel is fitted over all points of the augmented dataset (orange areas). The darker orange regions are those where several kernels overlap. On the right-hand side of the figure, the data point (in green) and the restriction to $[0,1]^d$ of the augmented dataset (in green).
  • Figure 3: Estimators for first-order indices of the Bratley function with $n=500$ (left) and $n=1000$ (right). The reference value is represented with a gray line.
  • Figure 4: Estimators for total indices of the Bratley function with $n=500$ (left) and $n=1000$ (right). The reference value of the index is represented with a gray line.
  • Figure 5: Estimators for first-order indices of the g-Sobol function with $n=500$ (left) and $n=1000$ (right). The reference value of the index is represented with a gray line.
  • ...and 5 more figures
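
The caption of Figure 2 mentions the 8 mirror images of a data point in $[0,1]^2$. One common construction that yields this count is sketched below. It is an assumption here, since the paper's precise definition is not reproduced in this section. Each coordinate $v$ is reflected about 0 or 1, giving $-v$, $v$, or $2-v$, and the $3^d-1$ combinations other than the identity are kept. The function name `mirror_images` and the sample point are hypothetical.

```python
import itertools

import numpy as np


def mirror_images(point):
    """Return the 3**d - 1 reflections of a point of [0, 1]^d.

    Assumption: each coordinate v is mapped to -v (reflection about 0),
    v (unchanged), or 2 - v (reflection about 1); the all-unchanged
    combination, i.e. the point itself, is skipped.
    """
    p = np.asarray(point, dtype=float)
    maps = (lambda v: -v, lambda v: v, lambda v: 2.0 - v)
    images = []
    for choice in itertools.product(range(3), repeat=p.size):
        if all(c == 1 for c in choice):
            continue  # identity: this is the original point
        images.append([maps[c](v) for c, v in zip(choice, p)])
    return np.array(images)


imgs = mirror_images([0.3, 0.8])
print(imgs.shape)  # one row per mirror image, one column per coordinate
```

Fitting a kernel on the original points together with their images, then restricting to $[0,1]^d$, is the usual way such reflections are used to reduce boundary bias.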

Theorems & Definitions (22)

  • Lemma 3.1
  • Lemma 3.2
  • Remark 3.3
  • Corollary 3.4
  • Lemma 3.5
  • Lemma 3.6
  • Corollary 3.7
  • Theorem 4.1: Central limit theorem
  • Proposition 4.2: Asymptotic efficiency for $\widehat{T}_{n,h_n}$ and $\widetilde{T}_{n,h_n}$
  • Corollary 4.3: Central limit theorem and asymptotic efficiency for $\widehat{S}_{n,h_n}$ and $\widetilde{S}_{n,h_n}$
  • ...and 12 more