How Dataflow Diagrams Impact Software Security Analysis: an Empirical Experiment
Simon Schneider, Nicolás E. Díaz Ferreyra, Pierre-Jean Quéval, Georg Simhandl, Uwe Zdun, Riccardo Scandariato
TL;DR
This study empirically evaluates how security-annotated Dataflow Diagrams (DFDs) affect analysts' performance in software security analysis in a microservice context. Using a within-subjects design with two similar Java applications, it finds a significant 41% increase in analysis correctness when a DFD is provided, and a 315% increase in the correctness of evidence among participants who reported using the provided traceability information. The results also reveal nuanced, task-dependent effects and identify three open challenges: understandability, presentation of missing features, and accessibility of traceability information. Overall, the work supports adopting model-based, security-annotated architectures while highlighting practical considerations for integrating DFDs into security workflows, and it informs practitioners and researchers about the benefits and limitations of DFDs in real-world security analysis and certification contexts.
Abstract
Models of software systems are used throughout the software development lifecycle. Dataflow diagrams (DFDs), in particular, are well-established resources for security analysis. Many techniques, such as threat modelling, are based on DFDs of the analysed application. However, their impact on the performance of analysts in a security analysis setting has not been explored before. In this paper, we present the findings of an empirical experiment conducted to investigate this effect. Following a within-groups design, participants were asked to solve security-relevant tasks for a given microservice application. In the control condition, the participants had to examine the source code manually. In the model-supported condition, they were additionally provided with a DFD of the analysed application and traceability information linking model items to artefacts in the source code. We found that the participants (n = 24) answered the analysis tasks significantly more correctly in the model-supported condition (41% increase in analysis correctness). Further, participants who reported using the provided traceability information performed better in giving evidence for their answers (315% increase in correctness of evidence). Finally, we identified three open challenges of using DFDs for security analysis based on the insights gained in the experiment.
