Decision Predicate Graphs: Enhancing Interpretability in Tree Ensembles

Leonardo Arrighi, Luca Pennella, Gabriel Marques Tavares, Sylvio Barbon Junior

TL;DR

This paper proposes Decision Predicate Graphs (DPG), a model-agnostic, graph-based tool for global interpretability of tree-based ensembles. DPG represents predicates (feature–value tests) as nodes and their co-occurrence frequencies as weighted edges. The paper formalises the construction of a DPG from a trained ensemble, provides a complexity analysis ($O(b \times s \times k^2)$) and pseudo-code, and introduces graph-theoretic metrics (betweenness centrality, local reaching centrality, and community detection) to quantify decision importance and class structure. Experiments on Iris and a synthetic multiclass dataset show how these metrics reveal influential predicates, classify decision pathways, and identify class-specific communities, offering insights beyond traditional visualisations. The work also compares DPG with ADD-based graph representations, highlighting advantages in weighting, global metrics, and scalability, and outlines potential improvements and extensions to regression problems, broader datasets, and additional interpretability tools. Overall, DPG enhances the global interpretability of tree ensembles by integrating graph theory with predicate-path analysis, providing actionable insights while preserving model performance.
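The idea of predicates as nodes and frequencies as edge weights can be illustrated with a minimal sketch. It is not the authors' implementation: the toy trees, the sample values, and the helper names `predicate_path` and `build_dpg` are hypothetical, and the sketch assumes that an edge links two consecutive predicates on a sample's path through a tree, with the weight counting how often that transition occurs across all trees and samples.

```python
from collections import Counter

# Hypothetical toy ensemble: each internal node is (feature, threshold, left, right);
# a leaf is a class-label string. These are illustrative, not trained trees.
TREE_A = ("petal_length", 2.5, "Class setosa",
          ("petal_width", 1.7, "Class versicolor", "Class virginica"))
TREE_B = ("petal_width", 0.8, "Class setosa",
          ("petal_length", 4.9, "Class versicolor", "Class virginica"))
TREES = [TREE_A, TREE_B]

# Hypothetical samples standing in for the training data.
SAMPLES = [
    {"petal_length": 1.4, "petal_width": 0.2},
    {"petal_length": 1.5, "petal_width": 0.3},
    {"petal_length": 4.5, "petal_width": 1.5},
    {"petal_length": 5.8, "petal_width": 2.2},
]


def predicate_path(tree, sample):
    """List the predicates a sample satisfies on its way to a leaf, ending with the leaf label."""
    path = []
    node = tree
    while not isinstance(node, str):
        feature, threshold, left, right = node
        if sample[feature] <= threshold:
            path.append(f"{feature} <= {threshold}")
            node = left
        else:
            path.append(f"{feature} > {threshold}")
            node = right
    path.append(node)  # the leaf (class) closes the path
    return path


def build_dpg(trees, samples):
    """Count transitions between consecutive predicates over every (tree, sample) pair."""
    edges = Counter()
    for tree in trees:
        for sample in samples:
            path = predicate_path(tree, sample)
            for src, dst in zip(path, path[1:]):
                edges[(src, dst)] += 1
    return edges


dpg = build_dpg(TREES, SAMPLES)
for (src, dst), weight in dpg.most_common(3):
    print(f"{src} -> {dst}: {weight}")
```

Storing the graph as a `Counter` keyed by `(source, target)` pairs keeps the sketch dependency-free; the same edge list could be loaded into a graph library to compute the metrics the paper discusses.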

Abstract

Understanding the decisions of tree-based ensembles and their relationships is pivotal for machine learning model interpretation. Recent attempts to mitigate the human-in-the-loop interpretation challenge have explored the extraction of the decision structure underlying the model taking advantage of graph simplification and path emphasis. However, while these efforts enhance the visualisation experience, they may either result in a visually complex representation or compromise the interpretability of the original ensemble model. In addressing this challenge, especially in complex scenarios, we introduce the Decision Predicate Graph (DPG) as a model-agnostic tool to provide a global interpretation of the model. DPG is a graph structure that captures the tree-based ensemble model and learned dataset details, preserving the relations among features, logical decisions, and predictions towards emphasising insightful points. Leveraging well-known graph theory concepts, such as the notions of centrality and community, DPG offers additional quantitative insights into the model, complementing visualisation techniques, expanding the problem space descriptions, and offering diverse possibilities for extensions. Empirical experiments demonstrate the potential of DPG in addressing traditional benchmarks and complex classification scenarios.
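As an illustration of how a centrality notion can be read off such a graph, the following sketch computes an unweighted local reaching centrality: for each node, the fraction of the other nodes reachable from it along directed edges. The edge list is a hypothetical toy example, and ignoring edge weights is a simplifying assumption made here, not necessarily the variant used in the paper.

```python
from collections import defaultdict, deque

# Hypothetical predicate graph: directed edges between predicates and class nodes.
EDGES = [
    ("petal_length <= 2.5", "Class setosa"),
    ("petal_length > 2.5", "petal_width <= 1.7"),
    ("petal_length > 2.5", "petal_width > 1.7"),
    ("petal_width <= 1.7", "Class versicolor"),
    ("petal_width > 1.7", "Class virginica"),
]


def local_reaching_centrality(edges):
    """Return, for each node, the share of other nodes reachable from it (unweighted)."""
    successors = defaultdict(set)
    nodes = set()
    for src, dst in edges:
        successors[src].add(dst)
        nodes.update((src, dst))

    scores = {}
    for start in nodes:
        # Breadth-first search collecting everything reachable from `start`.
        seen = {start}
        queue = deque([start])
        while queue:
            current = queue.popleft()
            for nxt in successors.get(current, ()):
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
        scores[start] = (len(seen) - 1) / (len(nodes) - 1)
    return scores


scores = local_reaching_centrality(EDGES)
for node, score in sorted(scores.items(), key=lambda item: -item[1]):
    print(f"{score:.2f}  {node}")
```

In this toy graph the predicate that opens the longest decision paths reaches the most nodes and therefore scores highest, while class nodes, which have no outgoing edges, score zero.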

Paper Structure

This paper contains 14 sections, 2 equations, 3 figures, 10 tables, and 1 algorithm.

Figures (3)

  • Figure 1: DPG of an RF composed of 5 tree base learners trained on the Iris dataset.
  • Figure 2: Two-dimensional depiction of the Iris dataset, employing feature pairs in each graph for visual representation.
  • Figure 3: ADD of an RF model with 5 tree base learners induced for the Iris dataset.