Automated Efficient Estimation using Monte Carlo Efficient Influence Functions

Raj Agrawal, Sam Witty, Andy Zane, Eli Bingham

TL;DR

MC-EIF is proved to be consistent, and estimators using MC-EIF are proved to achieve optimal $\sqrt{N}$ convergence rates. Empirically, estimators using MC-EIF are at parity with estimators using analytic EIFs.

Abstract

Many practical problems involve estimating low-dimensional statistical quantities with high-dimensional models and datasets. Several approaches address these estimation tasks based on the theory of influence functions, such as debiased/double ML or targeted minimum loss estimation. This paper introduces Monte Carlo Efficient Influence Functions (MC-EIF), a fully automated technique for approximating efficient influence functions that integrates seamlessly with existing differentiable probabilistic programming systems. MC-EIF automates efficient statistical estimation for a broad class of models and target functionals that would previously require rigorous custom analysis. We prove that MC-EIF is consistent, and that estimators using MC-EIF achieve optimal $\sqrt{N}$ convergence rates. We show empirically that estimators using MC-EIF are at parity with estimators using analytic EIFs. Finally, we demonstrate a novel capstone example using MC-EIF for optimal portfolio selection.

Paper Structure

This paper contains 33 sections, 5 theorems, 32 equations, 8 figures, 1 table, 3 algorithms.

Key Result

Theorem 3.4

(Theorem 3.5 in semi-theory-book) Suppose assum:cont_diff_prob, assum:cont_diff_func, and assum:fisher_inv hold. Then, the efficient influence function $\varphi_{\phi}(\tilde{x})$ at $\phi$ evaluated at the point $\tilde{x} \in \mathbb{R}^D$ equals
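
As a minimal sketch of how an influence function of this kind can be approximated by Monte Carlo, assume the standard parametric form $[\nabla_\phi \psi(\phi)]^\top I(\phi)^{-1} \nabla_\phi \log p_\phi(\tilde{x})$, where $\psi$ is the target functional and $I(\phi)$ is the Fisher information (both symbols are assumptions here, not taken from the statement above). The example uses a univariate Gaussian with $\phi = (\mu, \sigma)$ and $\psi(\phi) = \mu$, a hand-written score instead of automatic differentiation, and a direct 2x2 solve. It is an illustration under those assumptions, not the paper's implementation; all function names are hypothetical.

```python
import random

def score(x, mu, sigma):
    """Gradient of log N(x | mu, sigma^2) with respect to (mu, sigma)."""
    z = (x - mu) / sigma
    return (z / sigma, (z * z - 1.0) / sigma)

def mc_fisher(mu, sigma, num_samples, rng):
    """Monte Carlo Fisher information: average outer product of scores
    evaluated at points drawn from the model itself."""
    f11 = f12 = f22 = 0.0
    for _ in range(num_samples):
        s1, s2 = score(rng.gauss(mu, sigma), mu, sigma)
        f11 += s1 * s1
        f12 += s1 * s2
        f22 += s2 * s2
    return f11 / num_samples, f12 / num_samples, f22 / num_samples

def mc_eif(x, mu, sigma, grad_psi, num_samples=100_000, seed=0):
    """Approximate grad_psi^T F^{-1} score(x) with a Monte Carlo estimate of F."""
    rng = random.Random(seed)
    f11, f12, f22 = mc_fisher(mu, sigma, num_samples, rng)
    s1, s2 = score(x, mu, sigma)
    det = f11 * f22 - f12 * f12
    # Solve the 2x2 system F v = score(x) by Cramer's rule.
    v1 = (f22 * s1 - f12 * s2) / det
    v2 = (f11 * s2 - f12 * s1) / det
    return grad_psi[0] * v1 + grad_psi[1] * v2

# psi(mu, sigma) = mu, so grad_psi = (1, 0); the exact EIF is x - mu.
approx = mc_eif(x=4.0, mu=1.0, sigma=2.0, grad_psi=(1.0, 0.0))
exact = 4.0 - 1.0
print(f"Monte Carlo EIF: {approx:.3f}, exact EIF: {exact:.3f}")
```

For this model the exact Fisher information is diagonal, so the Monte Carlo value differs from the exact one only through sampling error in the estimated Fisher matrix, which shrinks as `num_samples` grows.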

Figures (8)

  • Figure 1: Comparison between MC-EIF and empirical Gateaux approximation. MC-EIF (a and b) is less sensitive to hyperparameters ($\epsilon$ and $\lambda$) than the empirical Gateaux baseline (c).
  • Figure 2: Empirical evidence for convergence theory. Increasing $p$ for the average treatment effect experiments produces MC-EIF approximation errors that closely match \ref{thm:monte_eif_convg}.
  • Figure 3: Comparison between ATE estimators using MC-EIF and analytic EIF. MC-EIF produces ATE estimates that are very close to the diagonal, which represents an oracle estimator of the EIF.
  • Figure 4: We taxonomize the workflow of robust estimation into three stages: the derivation of an (approximate and/or efficient) influence function, the numerical derivation and analysis required for its computation, and the code required to compute it. For the analytic workflow, the derivation of the IF results in \ref{eq:analytic_if_mpe}. This largely involves terms already required by the original plug-in (\ref{eq:mpe}), but still must be implemented on a case-by-case basis in code. For the "Empirical Gateaux" workflow, the first stage requires only the general purpose \ref{eq:emp_gat_if_approx}, but demands case-specific numerical considerations and derivations like the one shown in \ref{eq:emp_gat_mpe_numeric}. In stark contrast, given a differentiable approximation to the functional of interest, MC-EIF "automates" each stage through use of an end-to-end, general purpose solution.
  • Figure 5: Comparison of the plug-in estimator and efficient estimators using MC-EIF and analytic EIF for estimating the ATE. The true ATE is 0, so estimates closer to zero are better. The distribution is over 100 simulated datasets. Dashed lines represent estimates using the analytic EIF, and solid lines represent estimates using MC-EIF (when applicable).
  • ...and 3 more figures
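
The plug-in versus efficient-estimator comparison in the captions above can be illustrated with a one-step correction: add the sample mean of the influence function to the plug-in estimate. The sketch below is a toy under stated assumptions, not the paper's ATE experiment. It uses a Gaussian model, the target $\psi(\mu, \sigma) = \mu^2 + \sigma^2$ (the second moment), a closed-form inverse Fisher matrix in place of a Monte Carlo estimate, and no sample splitting. All names are hypothetical.

```python
import random

def eif_second_moment(x, mu, sigma):
    """EIF of psi(mu, sigma) = mu^2 + sigma^2 under N(mu, sigma^2):
    grad_psi^T F^{-1} score(x), with closed-form F^{-1} = diag(sigma^2, sigma^2 / 2)."""
    z = (x - mu) / sigma
    score = (z / sigma, (z * z - 1.0) / sigma)
    grad_psi = (2.0 * mu, 2.0 * sigma)
    inv_fisher_diag = (sigma ** 2, sigma ** 2 / 2.0)
    return sum(g * f * s for g, f, s in zip(grad_psi, inv_fisher_diag, score))

def one_step(data, mu_hat, sigma_hat):
    """Return the plug-in estimate and its one-step corrected version."""
    plug_in = mu_hat ** 2 + sigma_hat ** 2
    correction = sum(eif_second_moment(x, mu_hat, sigma_hat) for x in data) / len(data)
    return plug_in, plug_in + correction

rng = random.Random(0)
data = [rng.gauss(1.0, 2.0) for _ in range(5_000)]  # true E[X^2] = 1 + 4 = 5

# A deliberately biased fit, so the plug-in estimate is far from the truth.
plug_in, corrected = one_step(data, mu_hat=0.5, sigma_hat=1.5)
sample_second_moment = sum(x * x for x in data) / len(data)
print(f"plug-in: {plug_in:.3f}, one-step: {corrected:.3f}")
```

In this toy case the influence function simplifies to $x^2 - \mu^2 - \sigma^2$, so the corrected estimate equals the sample second moment regardless of how poor the initial fit is.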

Theorems & Definitions (12)

  • Definition 2.1
  • Definition 2.2
  • Theorem 3.4
  • Theorem 3.8
  • Proposition 4.1
  • Proposition 4.3
  • proof
  • proof
  • Lemma 1.1
  • proof
  • ...and 2 more