ChronoGraph: A Real-World Graph-Based Multivariate Time Series Dataset

Adrian Catalin Lutu, Ioana Pintilie, Elena Burceanu, Andrei Manolache

TL;DR

The paper introduces ChronoGraph, a real-world, graph-structured multivariate time series dataset derived from production microservices and annotated with explicit topology and incident labels. It provides six months of telemetry for about 708 services at a 30-minute cadence, with five metrics per service and eight signals per inter-service edge, plus 17 labeled disruption windows. The authors benchmark forecasting models and anomaly detectors and find that short-horizon forecasting is achievable, that long-horizon performance degrades, and that standard methods struggle to exploit the graph structure. ChronoGraph is proposed as a foundation for developing and evaluating graph-aware forecasting and incident-aware anomaly detection in microservice environments.

Abstract

We present ChronoGraph, a graph-structured multivariate time series forecasting dataset built from real-world production microservices. Each node is a service that emits a multivariate stream of system-level performance metrics, capturing CPU, memory, and network usage patterns, while directed edges encode dependencies between services. The primary task is forecasting future values of these signals at the service level. In addition, ChronoGraph provides expert-annotated incident windows as anomaly labels, enabling evaluation of anomaly detection methods and assessment of forecast robustness during operational disruptions. Compared to existing benchmarks from industrial control systems or traffic and air-quality domains, ChronoGraph uniquely combines (i) multivariate time series, (ii) an explicit, machine-readable dependency graph, and (iii) anomaly labels aligned with real incidents. We report baseline results spanning forecasting models, pretrained time-series foundation models, and standard anomaly detectors. ChronoGraph offers a realistic benchmark for studying structure-aware forecasting and incident-aware evaluation in microservice systems.
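To make the described structure concrete, the following is a minimal sketch of how per-service time series, a directed dependency graph, and incident windows might be held in memory and cut into forecasting examples. It is not the dataset's actual file format or loader: the array layout, the toy sizes, the random data, and the helper names `incident_mask` and `sliding_windows` are all assumptions made for illustration. Only the 30-minute cadence and the five metrics per service come from the summary above.

```python
import numpy as np

# All sizes and names here are illustrative assumptions, not the real layout.
STEPS_PER_DAY = 48        # 30-minute cadence gives 48 samples per day
NUM_SERVICES = 4          # toy graph; the paper reports about 708 services
NUM_NODE_METRICS = 5      # five metrics per service


def incident_mask(num_steps, windows):
    """Return a boolean mask that is True inside any [start, end) window."""
    mask = np.zeros(num_steps, dtype=bool)
    for start, end in windows:
        mask[start:end] = True
    return mask


def sliding_windows(series, context, horizon):
    """Split a (T, N, F) array into (context, horizon) forecasting pairs."""
    num_steps = series.shape[0]
    starts = range(num_steps - context - horizon + 1)
    inputs = np.stack([series[s:s + context] for s in starts])
    targets = np.stack(
        [series[s + context:s + context + horizon] for s in starts]
    )
    return inputs, targets


rng = np.random.default_rng(0)
num_steps = 7 * STEPS_PER_DAY  # one toy week of synthetic data

# Synthetic stand-in for node telemetry: (time, service, metric).
node_series = rng.normal(size=(num_steps, NUM_SERVICES, NUM_NODE_METRICS))

# Directed dependency edges as (source, target) index pairs (hypothetical).
edges = np.array([[0, 1], [1, 2], [1, 3]])

# One hypothetical incident window, expressed in time-step indices.
labels = incident_mask(num_steps, [(100, 110)])

# Two days of context, one day of horizon.
X, Y = sliding_windows(node_series, context=96, horizon=48)
print(X.shape, Y.shape, int(labels.sum()))
```

Keeping the incident mask aligned with the time axis makes it straightforward to score forecasts separately inside and outside disruption windows, which is the kind of incident-aware evaluation the abstract describes.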

Paper Structure

This paper contains 18 sections, 7 figures, 2 tables.

Figures (7)

  • Figure 1: System architecture at a given time. Nodes represent microservices, and edges indicate inter-service communication. Ground-truth disruptions (red) and high-confidence predictions from our ensemble (orange) are overlaid. The spatial clustering of anomalies in densely connected regions suggests that abnormal behavior tends to propagate along the underlying graph topology.
  • Figure 2: Chronos forecasts on two metrics of the same microservice during a disruption. Left: predictions drift when the anomaly occurs (red band). Right: the model captures only a periodic baseline, missing the bursty variability of network traffic.
  • Figure 3: Chronos forecasts on two metrics of a microservice. Left: network traffic is accurate at first but then drifts upward even without an anomaly, illustrating the limited stability of long-horizon predictions. Right: CPU usage is forecast accurately across the full test time window.
  • Figure 4: Comparison of Prophet model and ensemble anomaly detection outputs across multiple container metrics.
  • Figure 5: Comparison of anomaly frequency distributions.
  • ...and 2 more figures