On the Normalization of Confusion Matrices: Methods and Geometric Interpretations

Johan Erbani, Pierre-Edouard Portier, Elod Egyed-Zsigmond, Sonia Ben Mokhtar, Diana Nurbakova

Abstract

The confusion matrix is a standard tool for evaluating classifiers by providing insights into class-level errors. In heterogeneous settings, its values are shaped by two main factors: class similarity -- how easily the model confuses two classes -- and distribution bias, arising from skewed distributions in the training and test sets. However, confusion matrix values reflect a mix of both factors, making it difficult to disentangle their individual contributions. To address this, we introduce bistochastic normalization using Iterative Proportional Fitting, a generalization of row and column normalization. Unlike standard normalizations, this method recovers the underlying structure of class similarity. By disentangling error sources, it enables more accurate diagnosis of model behavior and supports more targeted improvements. We also show a correspondence between confusion matrix normalizations and the model's internal class representations. Both standard and bistochastic normalizations can be interpreted geometrically in this space, offering a deeper understanding of what normalization reveals about a classifier.
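
The three normalizations mentioned in the abstract can be illustrated with a minimal sketch. It assumes a confusion matrix with strictly positive entries (so that the alternating procedure converges) and uses NumPy; the function names, the tolerance, and the toy matrix are illustrative choices, not taken from the paper. Iterative Proportional Fitting is written here as alternating row and column normalization until both sets of sums equal 1.

```python
import numpy as np

def row_normalize(M):
    """Divide each row by its sum, so every row sums to 1."""
    M = np.asarray(M, dtype=float)
    return M / M.sum(axis=1, keepdims=True)

def col_normalize(M):
    """Divide each column by its sum, so every column sums to 1."""
    M = np.asarray(M, dtype=float)
    return M / M.sum(axis=0, keepdims=True)

def bi_normalize(M, tol=1e-10, max_iter=10_000):
    """Bistochastic normalization by Iterative Proportional Fitting.

    Alternates row and column normalization until the row sums are
    within `tol` of 1 (column sums are exactly 1 after each pass).
    Assumes all entries of M are strictly positive.
    """
    B = np.asarray(M, dtype=float).copy()
    for _ in range(max_iter):
        B = B / B.sum(axis=1, keepdims=True)   # row step
        B = B / B.sum(axis=0, keepdims=True)   # column step
        if np.abs(B.sum(axis=1) - 1.0).max() < tol:
            break
    return B

# Hypothetical 3-class confusion matrix (rows: true class, columns: predicted).
M = np.array([[50, 10, 5],
              [30, 200, 20],
              [2, 4, 10]])

R = row_normalize(M)
C = col_normalize(M)
B = bi_normalize(M)
print(np.round(B, 3))
```

Row normalization fixes only the row sums and column normalization only the column sums; the bistochastic result satisfies both constraints at once, which is the sense in which it generalizes the two.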

Paper Structure

This paper contains 39 sections, 4 theorems, 51 equations, 6 figures, and 2 algorithms.

Key Result

Proposition 1

$\operatorname{bi}(M)$ satisfies idempotence, class distribution invariance, and information preservation, as described in the subsection "Informal Definition and Desirable Properties".
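
Two of these properties can be checked numerically with a short sketch. It rests on assumptions not stated in this summary: that $\operatorname{bi}$ is computed by alternating row and column normalization on a strictly positive matrix, and that class distribution invariance means the result is unchanged when rows and columns of $M$ are rescaled by positive factors (a stand-in for changing class frequencies in the test set and in the predictions). The function name `bi`, the iteration count, and the random matrices are illustrative; information preservation is not tested here.

```python
import numpy as np

def bi(M, n_iter=2000):
    """Alternate row and column normalization (IPF) a fixed number of times.

    Assumes strictly positive entries; n_iter is a generous illustrative choice.
    """
    B = np.asarray(M, dtype=float).copy()
    for _ in range(n_iter):
        B /= B.sum(axis=1, keepdims=True)   # rows sum to 1
        B /= B.sum(axis=0, keepdims=True)   # columns sum to 1
    return B

rng = np.random.default_rng(0)
M = rng.uniform(1.0, 10.0, size=(4, 4))     # hypothetical positive matrix
B = bi(M)

# Idempotence: normalizing an already bistochastic matrix leaves it unchanged.
idempotent = np.allclose(bi(B), B)

# Invariance (as assumed above): rescale rows and columns by positive factors.
r = rng.uniform(0.5, 5.0, size=4)
c = rng.uniform(0.5, 5.0, size=4)
M_shifted = r[:, None] * M * c[None, :]
invariant = np.allclose(bi(M_shifted), B)

print(idempotent, invariant)
```

Under these assumptions the invariance check reflects the uniqueness of the bistochastic scaling of a positive matrix: rescaling rows and columns changes the scaling factors found by the iteration, but not its limit.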

Figures (6)

  • Figure 1: How class similarity and distribution bias affect confusion matrices. The first three subfigures isolate each factor, while the last combines them: (a) Test set imbalance, with B as the majority class and C as the minority; (b) Prediction imbalance, with C over-predicted and B under-predicted; (c) Class similarities; (d) Combined influence of all factors. Darker colors indicate higher values.
  • Figure 2: Toy example showing a 2D projection of embedded observations forming class clusters (top map). Each cluster is associated with a histogram indicating the spatial distribution of points (bottom map).
  • Figure 3: Overlap (%) between target matrices and normalized matrices. Legend: bi, row, col, all.
  • Figure 4: One instance (seed 0) on Fashion-MNIST with $\alpha = 0.3$. Bi-normalization best recovers class relationships under distribution shifts. Diagonal variations are shown in orange; off-diagonal in blue, with darker shades indicating higher values.
  • Figure 5: Overlap (%) between target matrices and normalized matrices. Legend: bi, row, col, all.
  • ...and 1 more figure

Theorems & Definitions (5)

  • Definition 1
  • Proposition 1
  • Proposition 2
  • Lemma 1
  • Lemma 2