Detecting Performance-Relevant Changes in Configurable Software Systems

Sebastian Böhm, Florian Sattler, Norbert Siegmund, Sven Apel

TL;DR

ConfFLARE tackles the cost of performance regression testing in configurable software with a static, region-based analysis that detects data-flow interactions between code changes and performance-relevant code. It introduces commit and feature code regions, uses a taint analysis driven by an exploded supergraph (ESG) to identify performance-relevant interactions, and extracts the features involved in them to constrain the configuration space for testing. Across synthetic, seeded, and real-world scenarios, ConfFLARE detects regressions in almost all cases and reduces the number of configurations to be tested by 79% on average for synthetic and by 70% for real-world regression scenarios, saving hours of testing time. Its limitations lie in handling implicit flows and function-pointer cases. ConfFLARE can complement existing sampling and modeling approaches to make performance regression workflows in configurable software systems more efficient.

Abstract

Performance is a volatile property of a software system, and frequent performance profiling is required to keep knowledge about the system's performance behavior up to date. Repeating all performance measurements after every revision is cost-intensive, especially in the presence of configurability, where one has to measure multiple configurations to obtain a comprehensive picture. Configuration sampling is a common approach to control the measurement cost. However, it cannot guarantee completeness and might miss performance regressions, especially if they affect only a few configurations. As an alternative way to reduce this cost, we present ConfFLARE, which estimates whether a change potentially impacts performance by identifying data-flow interactions with performance-relevant code and extracts the software features that participate in such interactions. Based on these features, we can select a subset of relevant configurations on which to focus performance profiling efforts. In a study conducted on both synthetic and real-world software systems, ConfFLARE correctly detects performance regressions in almost all cases and identifies relevant features in all but two cases, reducing the number of configurations to be tested on average by $79\%$ for synthetic and by $70\%$ for real-world regression scenarios, saving hours of performance testing time.
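
To illustrate why knowing the relevant features shrinks the measurement effort, the following minimal sketch enumerates only those configurations that differ in the relevant features. It is not ConfFLARE's actual selection procedure. It assumes binary features without constraints between them and pins all other features to a default value. The function `select_configurations` and the feature `Encryption` are hypothetical names introduced for this illustration; `StrongCompression` and `Verbosity` are taken from the running example.

```python
from itertools import product


def select_configurations(features, relevant, defaults=None):
    """Enumerate configurations that vary only the relevant features.

    Assumption: all features are binary and unconstrained. Features that are
    not relevant are pinned to a default value (False unless given), so the
    result has 2**len(relevant) entries instead of 2**len(features).
    """
    defaults = defaults or {}
    # Keep the declaration order of the features for reproducible output.
    varied = [f for f in features if f in relevant]
    configs = []
    for values in product([False, True], repeat=len(varied)):
        config = {f: defaults.get(f, False) for f in features}
        config.update(zip(varied, values))
        configs.append(config)
    return configs


# "Encryption" is a hypothetical third feature added for illustration.
features = ["StrongCompression", "Verbosity", "Encryption"]
selected = select_configurations(features, relevant={"StrongCompression"})

# Share of the full configuration space that no longer has to be measured.
reduction = 1 - len(selected) / 2 ** len(features)
print(len(selected), f"{reduction:.0%}")  # prints: 2 75%
```

In this toy setting, measuring 2 instead of 8 configurations suffices if only `StrongCompression` participates in a performance-relevant interaction. A real system would additionally have to respect the constraints of its feature model.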

Paper Structure

This paper contains 50 sections, 6 equations, 10 figures, 5 tables, 1 algorithm.

Figures (10)

  • Figure 1: Example of a performance-relevant code change. One function contains performance-relevant code. The callout points to a change that was made to Line 2. This change influences the performance behavior of the performance-relevant code region since it changes how the loop counter is calculated. The example contains two features named StrongCompression and Verbosity, each controlled by its own parameter. ConfFLARE reveals that only StrongCompression affects the example's performance, by influencing the value of one variable.
  • Figure 2: Continuation of our running example from Figure 1. The code on the left is identical to Figure 1. On the right is the corresponding control-flow graph with line numbers as nodes and dashed arrows showing control flow. The thick black arrows denote data flow between the change and performance-relevant code passing through StrongCompression and Verbosity.
  • Figure 3: Simplified ESG for our running example. For each function, we show its CFG with line numbers as nodes on the left and a simplified version of its ESG on the right that only contains nodes relevant for this example. Function calls are split into separate call and return nodes (suffixes c and r). The loop header in Line 10 is split into three nodes: variable declaration (10d), loop condition (10c), and increment (10i). The highlighted arrows show the result of the path reconstruction for the performance-relevant interaction between the change in Line 2 and the performance-relevant code region in Lines 10 and 11.
  • Figure 4: Simplified code examples of the Inter scenarios demonstrating different interaction patterns between features and performance-relevant code. One function is performance-relevant, and code controlled by a feature variable is feature code. Solid lines represent data flow; the dotted line represents implicit flow.
  • Figure 5: Pairs of consecutive revisions of each subject system constitute the set of regression scenarios. For each scenario, we establish a ground truth by performing performance measurements for all configurations. Additionally, we obtain a regression classification and the related features/configurations from our performance interaction analysis. We answer our research questions by comparing these results to the ground truth.
  • ...and 5 more figures

Theorems & Definitions (6)

  • Definition 1
  • Definition 2
  • Definition 3
  • Definition 4
  • Definition 5
  • Definition 6