Replications, Revisions, and Reanalyses: Managing Variance Theories in Software Engineering

Julian Frattini, Jannik Fischbach, Davide Fucci, Michael Unterkalmsteiner, Daniel Mendez

TL;DR

Variance theories quantify the effect of independent variables on outcomes in software engineering, but current synthesis relies heavily on meta-analysis with limited handling of heterogeneity and nonexperimental evidence. The authors propose a formal framework that treats evidence as E(h,d,m) and models its evolution through replication, revision, and reanalysis, integrating causal inference and model comparison to assess internal and external validity. They demonstrate the approach in the requirements quality domain, using a version-control-like visualization to trace evidence, disentangle sub-steps, and identify where revisions were or were not properly validated. The framework aims to enable dynamic, prospective synthesis that guides future studies and improves practitioner decision support by producing more valid variance theories than static retrospective reviews.

Abstract

Variance theories quantify the variance that one or more independent variables cause in a dependent variable. In software engineering (SE), variance theories are used to quantify, among other things, the impact of tools, techniques, and other treatments on software development outcomes. To acquire variance theories, evidence from individual empirical studies needs to be synthesized into more generally valid conclusions. However, research synthesis in SE is mostly limited to meta-analysis, which requires homogeneity of the synthesized studies to infer generalizable variance. In this paper, we aim to extend the practice of research synthesis beyond meta-analysis. To this end, we derive a conceptual framework for the evolution of variance theories and demonstrate its use by applying it to an active research field in SE. The resulting framework allows researchers to put new evidence in a clear relation to an existing body of knowledge and systematically expand the scientific frontier of a studied phenomenon.
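
The summary above describes evidence as E(h,d,m) evolving through replication, revision, and reanalysis. A minimal sketch of that idea follows, under stated assumptions: it reads h, d, and m as hypothesis, data, and method, and assumes that a replication changes only the data, a revision only the hypothesis, and a reanalysis only the method. The class `Evidence`, the function `classify_step`, and the labels `"identical"` and `"compound"` are hypothetical names introduced for illustration; they are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Evidence:
    """One piece of evidence E(h, d, m). Field meanings are assumptions."""
    hypothesis: str  # h: identifier of the causal hypothesis (e.g. "h1")
    data: str        # d: identifier of the data set
    method: str      # m: identifier of the analysis method


def classify_step(old: Evidence, new: Evidence) -> str:
    """Label the step from `old` to `new` by which component changed.

    Assumed mapping (not confirmed by the source text):
      only data changed       -> "replication"
      only hypothesis changed -> "revision"
      only method changed     -> "reanalysis"
    Changing several components at once is labeled "compound", i.e. a
    step that would need to be disentangled into sub-steps.
    """
    changed = [
        name
        for name in ("hypothesis", "data", "method")
        if getattr(old, name) != getattr(new, name)
    ]
    if not changed:
        return "identical"
    if changed == ["data"]:
        return "replication"
    if changed == ["hypothesis"]:
        return "revision"
    if changed == ["method"]:
        return "reanalysis"
    return "compound"


# Hypothetical identifiers, used only to exercise the function.
original = Evidence(hypothesis="h1", data="d1", method="m1")
same_study_new_data = Evidence(hypothesis="h1", data="d2", method="m1")
new_hypothesis_and_data = Evidence(hypothesis="h2", data="d2", method="m1")

print(classify_step(original, same_study_new_data))      # replication
print(classify_step(original, new_hypothesis_and_data))  # compound
```

Flagging the second step as `"compound"` mirrors the summary's point about disentangling sub-steps: a study that changes hypothesis and data together cannot, on its own, show which change drove a difference in results.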

Paper Structure

This paper contains 34 sections, 7 figures, 2 tables.

Figures (7)

  • Figure 1: DAGs representing a hypothesis and three revisions
  • Figure 2: Framework describing the evolution of quantitative, empirical evidence
  • Figure 3: Hypothesis $h_1$ investigated by Femmer et al. [femmer2014impact]
  • Figure 4: Hypothesis $h_2$ revised by Frattini et al. [frattini2024second]
  • Figure 5: Hypothesis $h_3$ revised by Frattini et al. [frattini2024applying]
  • ...and 2 more figures