Hierarchical confusion matrix for classification performance evaluation

Kevin Riehl, Michael Neunteufel, Martin Hemberg

TL;DR

The paper introduces a hierarchical confusion matrix to adapt traditional binary evaluation metrics to hierarchical classification, enabling evaluation that respects hierarchical structure and path-level correctness. It generalizes the approach to trees and DAGs, and to single-path and multi-path labeling with optional leaf-node prediction, unifying evaluation across diverse hierarchical problems. Through three real-world benchmarks, the authors show that hierarchical-confusion-based measures yield meaningful, interpretable rankings and reveal differences from conventional metrics in structure-sensitive settings. The work provides a practical, open-source implementation to facilitate standardized evaluation of hierarchical classifiers across domains.

Abstract

In this work we propose a novel concept of a hierarchical confusion matrix, opening the door for popular confusion-matrix-based (flat) evaluation measures from binary classification problems, while considering the peculiarities of hierarchical classification problems. We develop the concept to a generalized form and prove its applicability to all types of hierarchical classification problems, including directed acyclic graphs, multi-path labelling, and non-mandatory leaf node prediction. Finally, we use measures based on the novel confusion matrix to evaluate models within a benchmark for three real-world hierarchical classification applications and compare the results to established evaluation measures. The results outline the reasonability of this approach and its usefulness to evaluate hierarchical classification problems. The implementation of the hierarchical confusion matrix is available on GitHub.

Paper Structure (13 sections, 7 equations, 4 figures, 5 tables)

Figures (4)

  • Figure 1: Classification problems by example. The selected class is marked gray. (a) binary classification. (b) multi-class classification. (c) multi-label classification. (d) hierarchical classification. Note that the single selected category (dark chocolate) causes the selection of multiple parental vertices in the graph on a path up to the root node.
  • Figure 2: Structures of hierarchical classification. (a) trees. (b) directed acyclic graphs (DAG).
  • Figure 3: Approaches to hierarchical classification. The use of classifiers is marked with dashed lines. (a) transforms leaf node classification to a multi-class problem. (b) considers all classes at the same time. (c) employs a binary classifier for each vertex of the structure. (d) utilizes a multi-class classifier for each parental node. (e) uses a multi-class classifier at each level of the taxonomy.
  • Figure 4: Four exemplary predictions and their evaluation using the concept of hierarchical confusion. The true class or classes are coloured gray and marked with a cross. Predictions are marked with numbered arrows, and prediction paths are marked with dotted or dashed lines. The tables to the right of the diagrams present the resulting numbers for the confusion matrix fields and the relevant nodes that were considered in the specific context.
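
The Figure 1 caption notes that selecting a single category in a hierarchy implicitly selects every parental vertex on the way up to the root, and Figure 2 distinguishes trees from DAGs. The following is a minimal sketch of that upward closure, not the authors' implementation. The toy hierarchy, the `PARENTS` mapping, and the function name `implied_nodes` are hypothetical and chosen only for illustration; only the "dark chocolate" label is borrowed from the caption.

```python
from typing import Dict, List, Set

# Hypothetical toy hierarchy, stored as child -> list of parents.
# "chocolate" has two parents, so this is a DAG rather than a tree.
PARENTS: Dict[str, List[str]] = {
    "root": [],
    "sweets": ["root"],
    "gifts": ["root"],
    "chocolate": ["sweets", "gifts"],
    "dark chocolate": ["chocolate"],
    "milk chocolate": ["chocolate"],
}


def implied_nodes(label: str, parents: Dict[str, List[str]]) -> Set[str]:
    """Return the label together with all of its ancestors up to the root."""
    seen: Set[str] = set()
    stack = [label]
    while stack:
        node = stack.pop()
        if node in seen:
            continue  # in a DAG a node can be reached via several paths
        seen.add(node)
        stack.extend(parents.get(node, []))
    return seen


selected = implied_nodes("dark chocolate", PARENTS)
print(sorted(selected))
```

In a tree the result is a single path from the label to the root. In a DAG it can contain several paths, which is why the traversal tracks visited nodes.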