
BIPeC: A Combined Change-Point Analyzer to Identify Performance Regressions in Large-scale Database Systems

Zhan Lyu, Thomas Bach, Yong Li, Nguyen Minh Le, Lars Hoemke

TL;DR

This work tackles automated detection of performance regressions in SAP HANA by analyzing large-scale time-series performance metrics. It introduces BIPeC, a framework that couples Bayesian change-point detection with the Pruned Exact Linear Time (PELT) algorithm, augmented by preprocessing, a stepwise detection-refinement pipeline, and a feedback loop for continuous improvement. The approach uses a Bayes factor $B = \frac{P(D|H_2)}{P(D|H_1)}$ with Poisson likelihoods and MCMC-based integration, alongside a PELT objective $C(D,n) + \beta n$ with an RBF-based cost, to accurately identify change points. Empirical results on public datasets and SAP HANA BMDB show that BIPeC delivers higher precision and F1 scores than traditional CPD methods, demonstrating improved accuracy, scalability, and practical impact for proactive performance management in large-scale databases.
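
As a rough illustration of the Bayes factor above, the sketch below compares a no-change hypothesis $H_1$ (one Poisson rate for the whole series) with a single-change hypothesis $H_2$ (two rates split at an unknown position). It rests on assumptions that are not taken from the paper: a conjugate Gamma(a, b) prior stands in for the MCMC-based integration so that each marginal likelihood has a closed form, the split position gets a uniform prior, and the function names `log_marginal` and `log_bayes_factor` as well as the sample counts are hypothetical.

```python
import math


def log_marginal(counts, a=1.0, b=1.0):
    """Log marginal likelihood of Poisson counts with a Gamma(a, b) prior
    (shape a, rate b) on the Poisson rate, integrated out analytically."""
    m, s = len(counts), sum(counts)
    return (
        a * math.log(b)
        - math.lgamma(a)
        + math.lgamma(a + s)
        - (a + s) * math.log(b + m)
        - sum(math.lgamma(x + 1) for x in counts)
    )


def log_bayes_factor(counts, a=1.0, b=1.0):
    """Return (log B, most likely split), where B = P(D|H2) / P(D|H1).

    H1: a single rate for all counts.
    H2: one change at an unknown split k, uniform prior over k = 1..n-1.
    """
    n = len(counts)
    log_h1 = log_marginal(counts, a, b)
    # Evidence for each candidate split: two independent segments.
    terms = [
        log_marginal(counts[:k], a, b) + log_marginal(counts[k:], a, b)
        for k in range(1, n)
    ]
    # Average the per-split evidences in log space (log-mean-exp).
    top = max(terms)
    log_h2 = top + math.log(sum(math.exp(t - top) for t in terms) / len(terms))
    best_split = terms.index(top) + 1
    return log_h2 - log_h1, best_split


# Hypothetical count series with a rate jump after the 8th observation.
shifted = [3, 4, 2, 3, 4, 3, 2, 3, 12, 11, 13, 12, 10, 12, 11, 13]
log_bf, split = log_bayes_factor(shifted)
print(f"log Bayes factor: {log_bf:.2f}, most likely split: {split}")
```

A positive log Bayes factor favours $H_2$ (a change occurred) and a negative one favours $H_1$. The prior parameters `a` and `b` are free choices here and would need tuning for real performance metrics.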

Abstract

Performance testing in large-scale database systems like SAP HANA is a crucial yet labor-intensive task, involving extensive manual analysis of thousands of measurements, such as CPU time and elapsed time. Manual maintenance of these metrics is time-consuming and susceptible to human error, making early detection of performance regressions challenging. We address these issues by proposing an automated approach to detect performance regressions in such measurements. Our approach integrates Bayesian inference with the Pruned Exact Linear Time (PELT) algorithm, detecting change points and performance regressions with higher precision and efficiency than previous approaches. Our method minimizes false negatives and helps ensure the reliability and performance quality of SAP HANA. The proposed solution can accelerate testing and contribute to more sustainable performance management practices in large-scale data management environments.
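
To make the PELT component mentioned above more concrete, here is a minimal sketch of penalized change-point search with pruning. It minimizes the sum of segment costs plus a penalty `beta` per change point, using a kernel (RBF) segment cost. The code is an illustration under assumptions, not the authors' implementation: the names `rbf_prefix_sums` and `pelt`, the fixed bandwidth `gamma`, the penalty value, and the sample series are all hypothetical.

```python
import math


def rbf_prefix_sums(x, gamma):
    """2D prefix sums of the RBF Gram matrix k(a, b) = exp(-gamma * (a - b)^2).

    P[i][j] holds the sum of k(x[a], x[b]) over all a < i and b < j.
    """
    n = len(x)
    P = [[0.0] * (n + 1) for _ in range(n + 1)]
    for i in range(n):
        for j in range(n):
            k = math.exp(-gamma * (x[i] - x[j]) ** 2)
            P[i + 1][j + 1] = k + P[i][j + 1] + P[i + 1][j] - P[i][j]
    return P


def pelt(x, beta, gamma=1.0):
    """Return the sorted change-point indices that minimize
    sum(segment costs) + beta * (number of change points)."""
    n = len(x)
    P = rbf_prefix_sums(x, gamma)

    def cost(s, t):
        # Kernel cost of segment x[s:t]: (t - s) minus the mean Gram-block sum.
        block = P[t][t] - P[s][t] - P[t][s] + P[s][s]
        return (t - s) - block / (t - s)

    F = [0.0] * (n + 1)  # F[t]: best penalized cost of x[:t]
    F[0] = -beta         # so that a single segment carries no penalty
    last = [0] * (n + 1)  # last[t]: start index of the final segment in x[:t]
    candidates = [0]
    for t in range(1, n + 1):
        vals = [(F[s] + cost(s, t) + beta, s) for s in candidates]
        F[t], last[t] = min(vals)
        # Pruning step: drop any s that can no longer start an optimal segment.
        candidates = [s for v, s in vals if v - beta <= F[t]] + [t]

    # Walk back through the stored segment starts to collect change points.
    change_points = []
    t = n
    while last[t] > 0:
        t = last[t]
        change_points.append(t)
    return sorted(change_points)


# Hypothetical metric series with a level shift after the 10th sample.
low = [1.0, 1.1, 0.9, 1.0, 1.2, 0.8, 1.0, 1.1, 0.9, 1.0]
high = [5.0, 5.1, 4.9, 5.0, 5.2, 4.8, 5.0, 5.1, 4.9, 5.0]
print(pelt(low + high, beta=2.0))
```

A larger `beta` yields fewer change points and a smaller one yields more. This sketch builds the full Gram matrix, so its memory use is quadratic in the series length, which is acceptable for illustration but not for very long series.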

Paper Structure (19 sections, 4 equations, 10 figures, 1 table, 2 algorithms)

Figures (10)

  • Figure 1: Typology of the methods described.
  • Figure 2: The architecture of BIPeC.
  • Figure 3: Bayesian algorithm's outcome.
  • Figure 4: PELT algorithm's outcome.
  • Figure 5: Feedback loop system.
  • ...and 5 more figures