
Defining Expertise: Applications to Treatment Effect Estimation

Alihan Hüyük, Qiyao Wei, Alicia Curth, Mihaela van der Schaar

TL;DR

This work reframes treatment effect estimation by treating decision-maker expertise as an actionable inductive bias. It defines two types of expertise, predictive and prognostic, according to whether actions align with treatment effects or with potential outcomes, and establishes a theoretical bound linking expertise to overlap via in-context action variability. Through synthetic simulations and benchmark comparisons, the authors show that the dominant type of expertise in a dataset significantly influences which CATE estimation method performs best, and propose an Expertise-informed pipeline that estimates expertise and adapts the estimator accordingly. The findings highlight the practical value of modeling expertise for model selection and estimation in domains with domain-driven decision-makers, such as healthcare or education.

Abstract

Decision-makers are often experts of their domain and take actions based on their domain knowledge. Doctors, for instance, may prescribe treatments by predicting the likely outcome of each available treatment. Actions of an expert thus naturally encode part of their domain knowledge, and can help make inferences within the same domain: Knowing doctors try to prescribe the best treatment for their patients, we can tell treatments prescribed more frequently are likely to be more effective. Yet in machine learning, the fact that most decision-makers are experts is often overlooked, and "expertise" is seldom leveraged as an inductive bias. This is especially true for the literature on treatment effect estimation, where often the only assumption made about actions is that of overlap. In this paper, we argue that expertise - particularly the type of expertise the decision-makers of a domain are likely to have - can be informative in designing and selecting methods for treatment effect estimation. We formally define two types of expertise, predictive and prognostic, and demonstrate empirically that: (i) the prominent type of expertise in a domain significantly influences the performance of different methods in treatment effect estimation, and (ii) it is possible to predict the type of expertise present in a dataset, which can provide a quantitative basis for model selection.

Paper Structure

This paper contains 34 sections, 4 theorems, 11 equations, 8 figures, 3 tables.

Key Result

Proposition 1

For all $\pi\in\Delta(\{0,1\})^{\mathcal{X}}$,

Figures (8)

  • Figure 1: The higher the expertise of a policy, the lower its in-context action variability, hence its overlap, has to be (Prop. 1). When expertise is high, leveraging it becomes critical as overlap would be low, making CATE estimation more challenging.
  • Figure 2: During our simulations, we increase the expertise: (i) by decreasing $\beta$ in $\pi_{\textit{soft}}$ (Best$\to$Expert), which also decreases the in-context action variability (i.e. the overlap), and (ii) by decreasing $d$ in $\pi_{\textit{mis}}$ (Worst$\to$Expert, as seen in Fig. \ref{fig:trafficlights}).
  • Figure 2: PEHE of various methods averaged across datasets from the Linear News environment.
  • Figure 3: As Best$\to$Expert (i.e. away from the "best case scenario"), treatment effect estimation gets generally harder and the performance of Baseline degrades. Similarly, as Worst$\to$Expert (i.e. away from the "worst case scenario") the performance of Baseline improves instead.
  • Figure 3: Performance improvements over Baseline. For predictive expertise, Action-predictive improves more and more upon Baseline by exploiting the increasing expertise (both as Best$\to$Expert and Worst$\to$Expert). In contrast, Balancing gets worse with increasing expertise, since the information it discards about the policy becomes more correlated with the treatment effects. For prognostic expertise, however, having non-predictive variables determine actions can help Balancing select against those variables when forming representations, improving performance upon Baseline in most configurations, unlike Action-predictive.
  • ...and 3 more figures
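
The relationship described in the captions of Figures 1 and 2 (a more expert policy has lower in-context action variability, hence lower overlap) can be illustrated with a small simulation. This is a minimal sketch under stated assumptions, not the authors' code: the form of the soft policy (a sigmoid of the treatment effect divided by a temperature `beta`), the toy treatment effect `tau`, and the use of the mean per-context action entropy as the variability measure are all hypothetical choices made for illustration.

```python
import math
import random

def binary_entropy(p):
    """Entropy (in nats) of a Bernoulli(p) action; 0 for deterministic actions."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -(p * math.log(p) + (1 - p) * math.log(1 - p))

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def soft_policy(tau, beta):
    """Hypothetical soft policy: P(A=1 | x) = sigmoid(tau(x) / beta)."""
    return sigmoid(tau / beta)

def action_variability(taus, beta):
    """Mean per-context action entropy under the soft policy (assumed measure)."""
    return sum(binary_entropy(soft_policy(t, beta)) for t in taus) / len(taus)

# Toy contexts: three Gaussian features and a linear treatment effect (assumed).
rng = random.Random(0)
weights = (1.0, -0.5, 0.25)
taus = [sum(w * rng.gauss(0.0, 1.0) for w in weights) for _ in range(5000)]

# Decreasing beta makes the policy follow the treatment effect more closely.
betas = [10.0, 1.0, 0.1]
variabilities = [action_variability(taus, b) for b in betas]
for b, v in zip(betas, variabilities):
    print(f"beta={b:>5}: mean action entropy = {v:.3f} nats")
```

In this sketch, a large `beta` yields a nearly uniform policy (entropy close to ln 2, i.e. high overlap), while a small `beta` yields a nearly deterministic one, so the printed entropy shrinks as `beta` decreases.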

Theorems & Definitions (8)

  • Definition 1: Prognostic expertise
  • Definition 2: Predictive expertise
  • Definition 3: Perfect prognostic/predictive expert
  • Definition 4: In-context action variability
  • Proposition 1: Boundedness of expertise and in-context action variability
  • Proposition 2: Determinism of perfect experts
  • Proposition 3
  • Proposition 4
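
To make the distinction between Definitions 1 and 2 concrete, the sketch below computes both quantities on a toy problem with binary potential outcomes. The formulas are an assumption, since this summary does not reproduce them: expertise is taken to be one minus the ratio of the conditional action entropy to the marginal action entropy, conditioning on the pair (Y0, Y1) for prognostic expertise and on the difference Y1 - Y0 for predictive expertise. The function names, the uniform outcome distribution, and the three example policies are hypothetical.

```python
import math
from collections import defaultdict
from itertools import product

def entropy(dist):
    """Shannon entropy (nats) of a dict {value: probability}."""
    return -sum(p * math.log(p) for p in dist.values() if p > 0)

def conditional_entropy(joint):
    """H[A | Z] from a dict {(z, a): probability}."""
    marginal = defaultdict(float)
    for (z, _), p in joint.items():
        marginal[z] += p
    return -sum(p * math.log(p / marginal[z])
                for (z, _), p in joint.items() if p > 0)

def expertise(policy):
    """Return (prognostic, predictive) expertise under the assumed formulas.

    policy(y0, y1) gives P(A=1 | y0, y1); (Y0, Y1) is uniform on {0,1}^2.
    Requires a non-degenerate policy (marginal action entropy > 0).
    """
    p_a = defaultdict(float)
    joint_prog = defaultdict(float)  # conditions on the pair (y0, y1)
    joint_pred = defaultdict(float)  # conditions on the effect y1 - y0
    for y0, y1 in product([0, 1], repeat=2):
        p1 = policy(y0, y1)
        for a, p in ((1, p1), (0, 1.0 - p1)):
            w = 0.25 * p
            p_a[a] += w
            joint_prog[((y0, y1), a)] += w
            joint_pred[(y1 - y0, a)] += w
    h_a = entropy(p_a)
    return (1.0 - conditional_entropy(joint_prog) / h_a,
            1.0 - conditional_entropy(joint_pred) / h_a)

# Three hypothetical policies.
effect_driven = lambda y0, y1: 1.0 if y1 > y0 else 0.0    # treats when it helps
baseline_driven = lambda y0, y1: 1.0 if y0 == 0 else 0.0  # treats poor prognosis
coin_flip = lambda y0, y1: 0.5                            # ignores outcomes

results = {name: expertise(pol) for name, pol in [
    ("effect_driven", effect_driven),
    ("baseline_driven", baseline_driven),
    ("coin_flip", coin_flip),
]}
for name, (e_prog, e_pred) in results.items():
    print(f"{name:>15}: prognostic={e_prog:.2f}, predictive={e_pred:.2f}")
```

Under these assumptions, a policy that acts on the untreated outcome alone is fully prognostic but only partly predictive, while a policy that acts on the treatment effect scores fully on both. Because Y1 - Y0 is a function of (Y0, Y1), the predictive value can never exceed the prognostic one in this formalization.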