Leveraging Expert Consistency to Improve Algorithmic Decision Support
Maria De-Arteaga, Vincent Jeanselme, Artur Dubrawski, Alexandra Chouldechova
TL;DR
This paper tackles the construct gap in high-stakes decision support: the gap between the decision criterion of interest, $Y^c$, and the proxies available as training labels, namely observed outcomes $Y$ and expert decisions $D$. It proposes a two-stage strategy: (i) estimate expert consistency using influence functions when each case has a single expert decision, and (ii) amalgamate labels into $Y^{\mathcal{A}}$ so that the model learns from expert decisions in consistently assessed cases and from observed outcomes otherwise. The methodology is evaluated with semi-synthetic simulations in a clinical setting and with a real-world child welfare dataset, where it yields better predictive performance and a narrower construct gap than learning from $Y$ or $D$ alone. The approach relies on information commonly available in organizational information systems, which makes it practical for organizations that hold archival expert decisions, while addressing non-random expert assignment and potential bias.
Abstract
Machine learning (ML) is increasingly being used to support high-stakes decisions. However, there is frequently a construct gap: a gap between the construct of interest to the decision-making task and what is captured in proxies used as labels to train ML models. As a result, ML models may fail to capture important dimensions of decision criteria, hampering their utility for decision support. Thus, an essential step in the design of ML systems for decision support is selecting a target label among available proxies. In this work, we explore the use of historical expert decisions as a rich -- yet also imperfect -- source of information that can be combined with observed outcomes to narrow the construct gap. We argue that managers and system designers may be interested in learning from experts in instances where they exhibit consistency with each other, while learning from observed outcomes otherwise. We develop a methodology to enable this goal using information that is commonly available in organizational information systems. This involves two core steps. First, we propose an influence function-based methodology to estimate expert consistency indirectly when each case in the data is assessed by a single expert. Second, we introduce a label amalgamation approach that allows ML models to simultaneously learn from expert decisions and observed outcomes. Our empirical evaluation, using simulations in a clinical setting and real-world data from the child welfare domain, indicates that the proposed approach successfully narrows the construct gap, yielding better predictive performance than learning from either observed outcomes or expert decisions alone.
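The label amalgamation step described above can be illustrated with a minimal sketch. It assumes that a per-case expert consistency score has already been estimated (the influence function-based estimation is not shown) and that a case counts as consistently assessed when its score reaches a threshold. The function name `amalgamate_labels`, the score scale in [0, 1], and the threshold value of 0.8 are all hypothetical choices made for illustration, not details taken from the paper.

```python
from typing import List, Sequence


def amalgamate_labels(
    outcomes: Sequence[int],
    decisions: Sequence[int],
    consistency: Sequence[float],
    threshold: float = 0.8,  # hypothetical cutoff, not a value from the paper
) -> List[int]:
    """Build amalgamated labels Y^A from outcomes Y and expert decisions D.

    For each case, use the expert decision when the estimated consistency
    score reaches the threshold, and the observed outcome otherwise.
    """
    if not (len(outcomes) == len(decisions) == len(consistency)):
        raise ValueError("outcomes, decisions and consistency must have equal length")

    amalgamated = []
    for y, d, score in zip(outcomes, decisions, consistency):
        # Consistently assessed case: learn from the expert decision.
        # Otherwise: fall back to the observed outcome.
        amalgamated.append(d if score >= threshold else y)
    return amalgamated


# Toy data (fabricated purely to exercise the function).
y = [0, 1, 0, 1]                    # observed outcomes Y
d = [1, 1, 0, 0]                    # expert decisions D
scores = [0.90, 0.20, 0.95, 0.50]   # assumed consistency estimates

y_a = amalgamate_labels(y, d, scores)
print(y_a)
```

In this toy example the first and third cases take their label from $D$ and the other two from $Y$. A model trained on the resulting labels would then learn from both sources at once. The sketch leaves open how to handle cases whose outcome is unobserved.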
