The PetShop Dataset -- Finding Causes of Performance Issues across Microservices
Michaela Hardt, William R. Orchard, Patrick Blöbaum, Shiva Kasiviswanathan, Elke Kirschbaum
TL;DR
The paper introduces the PetShop Dataset as a standardized benchmark for root-cause analysis (RCA) in microservice-based applications, addressing the lack of public datasets for quantitative RCA evaluation. The dataset covers 41 components and 68 injected issues with ground-truth root causes. It records latency, request, and availability metrics at 5-minute intervals and comes with train/test splits and a service map based on AWS X-Ray. Benchmarking a range of RCA methods across causal and non-causal formulations, the authors show that access to a causal graph improves performance, but that SCM-based approaches struggle when data are limited and simple correlation baselines remain competitive. The dataset, accompanying tooling, and evaluation protocol are publicly available to accelerate robust RCA research and to encourage community contributions covering other applications.
Abstract
Identifying root causes for unexpected or undesirable behavior in complex systems is a prevalent challenge. This issue becomes especially crucial in modern cloud applications that employ numerous microservices. Although the machine learning and systems research communities have proposed various techniques to tackle this problem, there is currently a lack of standardized datasets for quantitative benchmarking. Consequently, research groups are compelled to create their own datasets for experimentation. This paper introduces a dataset specifically designed for evaluating root cause analyses in microservice-based applications. The dataset encompasses latency, requests, and availability metrics emitted in 5-minute intervals from a distributed application. In addition to normal operation metrics, the dataset includes 68 injected performance issues, which increase latency and reduce availability throughout the system. We showcase how this dataset can be used to evaluate the accuracy of a variety of methods spanning different causal and non-causal characterisations of the root cause analysis problem. We hope the new dataset, available at https://github.com/amazon-science/petshop-root-cause-analysis/, enables further development of techniques in this important area.
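To make the idea of evaluating an RCA method against ground-truth root causes concrete, the following is a minimal sketch and not the paper's evaluation protocol or any of its benchmarked methods. It ranks components by how far a metric shifts during an incident relative to normal operation and then scores the ranking with top-k accuracy. The service names, the latency values, and the helper names `rank_by_anomaly_shift` and `top_k_accuracy` are all hypothetical, and nothing here assumes the dataset's actual file layout.

```python
from typing import Dict, List, Sequence


def rank_by_anomaly_shift(
    normal: Dict[str, Sequence[float]],
    anomalous: Dict[str, Sequence[float]],
) -> List[str]:
    """Rank components by the shift of their mean metric, in units of normal std."""
    scores: Dict[str, float] = {}
    for name, baseline in normal.items():
        mean = sum(baseline) / len(baseline)
        var = sum((x - mean) ** 2 for x in baseline) / len(baseline)
        std = var ** 0.5 or 1e-9  # guard against a constant baseline
        incident = anomalous[name]
        incident_mean = sum(incident) / len(incident)
        scores[name] = abs(incident_mean - mean) / std
    # Largest shift first: the top entry is the predicted root cause.
    return sorted(scores, key=scores.get, reverse=True)


def top_k_accuracy(rankings: List[List[str]], truths: List[str], k: int) -> float:
    """Fraction of incidents whose true root cause appears in the top k."""
    hits = sum(truth in ranking[:k] for ranking, truth in zip(rankings, truths))
    return hits / len(truths)


# Synthetic latency samples (milliseconds) for three made-up services.
normal = {
    "frontend": [100, 102, 98, 101],
    "payments": [50, 51, 49, 50],
    "db": [10, 11, 9, 10],
}
# Hypothetical incident in which "payments" is the injected root cause.
anomalous = {
    "frontend": [130, 128, 131],
    "payments": [90, 92, 91],
    "db": [10, 10, 11],
}

ranking = rank_by_anomaly_shift(normal, anomalous)
print(ranking)
print(top_k_accuracy([ranking], ["payments"], k=1))
```

A ranking-based score of this kind lets causal and non-causal methods be compared on equal footing, since each only needs to output an ordered list of candidate components per incident.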
