Tutorial Debriefing: Applied Statistical Causal Inference in Requirements Engineering
Julian Frattini, Hans-Martin Heyn, Robert Feldt, Richard Torkar
TL;DR
The paper tackles the difficulty of producing causal evidence in software engineering (SE) when randomized experiments are impractical, advocating statistical causal inference (SCI) grounded in causal DAGs to transparently encode assumptions and identify necessary controls. By examining relationships such as $x \rightarrow y$, $x \leftarrow z \rightarrow y$, and $x \rightarrow z \rightarrow y$, it shows how to distinguish confounding, mediation, and collider bias, and proposes a three-step workflow (Modeling, Identification, and Estimation) to obtain debiased causal effects. The contribution includes a framework for practicing SCI in SE, guidance on data collection and bias mitigation, and practical tooling recommendations (e.g., DAGitty, ggdag, brms) with open-source tutorial materials, aiming to strengthen causal reasoning in Requirements Engineering (RE). The paper also documents the tutorial lineage and advocates community engagement to accelerate SCI adoption in SE and RE.
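To make the fork $x \leftarrow z \rightarrow y$ concrete, the following minimal sketch simulates a confounded relationship and compares a naive estimate with one that adjusts for $z$. It is not taken from the tutorial materials: the data-generating coefficients, the helper `ols_slopes`, and the use of plain least squares with NumPy (rather than the R tooling named above) are all assumptions chosen for illustration.

```python
import numpy as np


def ols_slopes(y, *predictors):
    """Fit y ~ 1 + predictors by least squares; return slopes without the intercept.

    Hypothetical helper for this sketch only.
    """
    X = np.column_stack([np.ones_like(y), *predictors])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef[1:]


rng = np.random.default_rng(0)
n = 20_000
TRUE_EFFECT = 0.5  # assumed causal effect of x on y

# Modeling: assume the DAG x <- z -> y plus the direct edge x -> y.
z = rng.normal(size=n)                               # confounder
x = 1.0 * z + rng.normal(size=n)                     # exposure depends on z
y = TRUE_EFFECT * x + 2.0 * z + rng.normal(size=n)   # outcome depends on x and z

# Identification: under this DAG, {z} blocks the only backdoor path x <- z -> y.
# Estimation: compare the unadjusted and the adjusted regression slope of x.
naive = ols_slopes(y, x)[0]        # y ~ x, biased by the open backdoor path
adjusted = ols_slopes(y, x, z)[0]  # y ~ x + z, backdoor path closed

print(f"naive estimate:    {naive:.2f}")
print(f"adjusted estimate: {adjusted:.2f}")
```

Under these assumed coefficients the naive slope converges to 1.5 (covariance 3 divided by variance 2), three times the true effect, while the adjusted slope recovers a value close to 0.5. The adjustment set here is only valid because the DAG is known by construction; with real observational data it would have to be justified by the modeling step.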
Abstract
Like any scientific discipline, the software engineering (SE) research community strives to contribute to the betterment of the target population of our research: software producers and consumers. We will only achieve this betterment if we manage to transfer the knowledge acquired during research into practice. This transfer of knowledge may come in the form of tools, processes, and guidelines for software developers. However, the value of these contributions hinges on the assumption that applying them causes an improvement of the development process, user experience, or other performance metrics. Such a promise requires evidence of causal relationships between an exposure or intervention (i.e., the contributed tool, process, or guideline) and an outcome (i.e., performance metrics). A straightforward approach to obtaining this evidence is via controlled experiments in which a sample of a population is randomly divided into a group exposed to the new tool, process, or guideline, and a control group. However, such randomized controlled trials may not be legally, ethically, or logistically feasible. In these cases, we need a reliable process for statistical causal inference (SCI) from observational data.
