A Second Look at the Impact of Passive Voice Requirements on Domain Modeling: Bayesian Reanalysis of an Experiment

Julian Frattini, Davide Fucci, Richard Torkar, Daniel Mendez

TL;DR

The paper asks whether passive voice in natural-language requirements affects downstream domain modeling. It reanalyzes the only known controlled experiment on this question using a framework for statistical causal inference and Bayesian data analysis, which preserves uncertainty and accounts for confounders. The reanalysis finds that the effects reported in the original study are much less significant than previously assumed: once uncertainty and causal structure are taken into account, passive voice shows no robust impact on missing actors, domain objects, or associations. The findings support adopting Bayesian data analysis and causal frameworks in software engineering (SE) research to produce more reliable, nuanced conclusions and stronger empirical evidence in requirements engineering.

Abstract

The quality of requirements specifications may impact subsequent, dependent software engineering (SE) activities. However, empirical evidence of this impact remains scarce and too often superficial, as studies abstract too much from the phenomena under investigation. Two of these abstractions are caused by the lack of frameworks for causal inference and by frequentist methods, which reduce complex data to binary results. In this study, we aim to (1) demonstrate the use of a causal framework and (2) contrast frequentist methods with more sophisticated Bayesian statistics for causal inference. To this end, we reanalyze the only known controlled experiment investigating the impact of passive voice on the subsequent activity of domain modeling. We follow a framework for statistical causal inference and employ Bayesian data analysis methods to re-investigate the hypotheses of the original study. Our results reveal that the effects observed by the original authors are much less significant than previously assumed. This study supports the recent call to action in SE research to adopt Bayesian data analysis, including causal frameworks and Bayesian statistics, for more sophisticated causal inference.

Paper Structure

This paper contains 25 sections, 5 figures, 2 tables.

Figures (5)

  • Figure 1: Domain model example
  • Figure 2: Full DAG visualizing the causal assumptions (red: exposure/main factor, turquoise: response/dependent variables)
  • Figure 3: Reduced DAG including all variables eligible for the regression model
  • Figure 4: Isolated impact of passive voice on the likelihood of missing an actor, object, or association ("assoc.")
  • Figure 5: Impact of the number of missing actors and objects on the likelihood of missing an association