A Dataset of Performance Measurements and Alerts from Mozilla (Data Artifact)
Mohamed Bilel Besbes, Diego Elias Costa, Suhaib Mujahid, Gregory Mierzwinski, Marco Castelluccio
TL;DR
This work addresses the lack of publicly available, richly annotated performance datasets that capture time series, expert-validated alerts, and associated metadata such as bugs. It presents a dataset drawn from Mozilla Firefox's performance testing infrastructure, containing 5,655 performance time series, 17,989 alerts, and 482 linked bugs collected from May 2023 to May 2024, with cross-referenced alert summaries and manual validations by Performance Sheriffs. The dataset supports research in performance engineering, anomaly detection, and machine learning by enabling change-point detection and regression analysis across diverse platforms and test suites. The authors also provide scripts and a data organization that facilitate reproducibility and extension with newer measurements. The resource is intended for researchers and industry practitioners who aim to detect and remediate performance regressions earlier in the software lifecycle.
Abstract
Performance regressions in software systems can lead to significant financial losses and degraded user satisfaction, making their early detection and mitigation critical. Despite the importance of practices that capture performance regressions early, there is a lack of publicly available datasets that comprehensively capture real-world performance measurements, expert-validated alerts, and associated metadata such as bugs and testing conditions. To address this gap, we introduce a dataset to support research in performance engineering, anomaly detection, and machine learning. The dataset was collected from Mozilla Firefox's performance testing infrastructure between May 2023 and May 2024 and comprises 5,655 performance time series, 17,989 performance alerts, and detailed annotations of the resulting bugs. By publishing this dataset, we provide researchers with a resource for studying performance trends, developing novel change point detection methods, and advancing performance regression analysis across diverse platforms and testing environments. The dataset is available at https://doi.org/10.5281/zenodo.14642238
