ProRCA: A Causal Python Package for Actionable Root Cause Analysis in Real-world Business Scenarios

Ahmed Dawoud, Shravan Talupula

TL;DR

ProRCA addresses the challenge of explaining complex failures in large, interdependent systems by extending the DoWhy causal inference framework to reconstruct complete multi-hop root-cause pathways. It introduces a Combined Node Score, DFS-based pathway discovery, and a path-significance metric for ranking candidate root causes, enabling end-to-end causal tracing from an observed anomaly back to its initial triggers. The method is validated on synthetic retail data with four injected anomalies, where it identifies the injected pathways accurately and reports interpretable node-level statistics that show how upstream factors contribute to downstream effects. The approach yields actionable diagnostics for operational reliability and fault prevention in data-rich environments, with practical implications for real-world RCA workflows.

Abstract

Root Cause Analysis (RCA) is becoming ever more critical as modern systems grow in complexity, volume of data, and interdependencies. While traditional RCA methods frequently rely on correlation-based or rule-based techniques, these approaches can prove inadequate in highly dynamic, multi-layered environments. In this paper, we present a pathway-tracing package built on the DoWhy causal inference library. Our method integrates conditional anomaly scoring, noise-based attribution, and depth-first path exploration to reveal multi-hop causal chains. By systematically tracing entire causal pathways from an observed anomaly back to the initial triggers, our approach provides a comprehensive, end-to-end RCA solution. Experimental evaluations with synthetic anomaly injections demonstrate the package's ability to accurately isolate triggers and rank root causes by their overall significance.
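
The depth-first path exploration mentioned above can be illustrated with a minimal sketch. This is not the ProRCA API: the function names, the threshold, the example graph, the node scores, and the use of a mean node score as the path-significance value are all assumptions made for illustration only. In the package itself, node scores would come from the conditional anomaly scoring and noise-based attribution steps built on DoWhy.

```python
from typing import Dict, List, Tuple

def trace_paths(parents: Dict[str, List[str]],
                scores: Dict[str, float],
                target: str,
                threshold: float = 0.5) -> List[List[str]]:
    """Walk upstream from the anomalous target, depth first,
    following only parents whose score reaches the threshold."""
    paths: List[List[str]] = []

    def dfs(node: str, path: List[str]) -> None:
        # Parents that look anomalous and are not already on this path
        hot = [p for p in parents.get(node, [])
               if scores.get(p, 0.0) >= threshold and p not in path]
        if not hot:
            # No anomalous parent left: the path ends at a candidate trigger
            paths.append(list(path))
            return
        for p in hot:
            path.append(p)
            dfs(p, path)
            path.pop()

    dfs(target, [target])
    return paths

def rank_paths(paths: List[List[str]],
               scores: Dict[str, float]) -> List[Tuple[float, List[str]]]:
    """Rank paths by mean node score (an assumed stand-in for path significance)."""
    ranked = [(sum(scores[n] for n in p) / len(p), p) for p in paths]
    return sorted(ranked, key=lambda item: item[0], reverse=True)

# Hypothetical retail-style causal graph: child -> list of direct causes
parents = {
    "profit": ["revenue", "cost"],
    "revenue": ["units_sold", "price"],
    "units_sold": ["ad_spend"],
    "cost": ["shipping"],
}
# Hypothetical per-node anomaly scores in [0, 1]
scores = {
    "profit": 0.9, "revenue": 0.8, "units_sold": 0.7, "ad_spend": 0.95,
    "price": 0.1, "cost": 0.2, "shipping": 0.1,
}

paths = trace_paths(parents, scores, target="profit")
ranked = rank_paths(paths, scores)
```

With these made-up scores, the search follows only the anomalous branch (profit, revenue, units_sold, ad_spend) and skips the cost and price branches, which is the multi-hop behaviour the abstract describes: the trace ends at the most upstream anomalous node rather than at the first anomalous parent.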

Paper Structure

This paper contains 31 sections, 7 equations.