Semantic Depth Matters: Explaining Errors of Deep Vision Networks through Perceived Class Similarities

Katarzyna Filus, Michał Romaszewski, Mateusz Żarski

TL;DR

Understanding misclassifications in deep vision networks requires looking beyond accuracy; this work links error patterns to the semantic similarities a network perceives between classes. The authors propose Similarity Depth ($SD$) to quantify a network’s perceived semantic depth, and introduce Class Similarity Graphs (SSG, NSSG, NFSG), Similarity Graph Compliance (SGC), and Visual Similarity Graph Explanation to relate internal representations to WordNet semantics. The framework is data-free: it applies to pretrained models using only classifier weights, enabling rapid analysis without additional data. Experiments on Mini-ImageNet and CIFAR-100 show that higher $SD$ often aligns with greater explainability of errors (via perceived similarities) and that clustering boosts $SD$; the visualizations aid debugging and model refinement.

Abstract

Understanding deep neural network (DNN) behavior requires more than evaluating classification accuracy alone; analyzing errors and their predictability is equally crucial. Current evaluation methodologies lack transparency, particularly in explaining the underlying causes of network misclassifications. To address this, we introduce a novel framework that investigates the relationship between the semantic hierarchy depth perceived by a network and its real-data misclassification patterns. Central to our framework is the Similarity Depth (SD) metric, which quantifies the semantic hierarchy depth perceived by a network, along with a method for evaluating how closely the network's errors align with its internally perceived similarity structure. We also propose a graph-based visualization of model semantic relationships and misperceptions. A key advantage of our approach is that, by leveraging class templates (representations derived from classifier layer weights), it is applicable to already trained networks without requiring additional data or experiments. Our approach reveals that deep vision networks encode specific semantic hierarchies and that high semantic depth improves the compliance between perceived class similarities and actual errors.
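To make the idea of class templates concrete, the following is a minimal sketch of how a class similarity matrix could be derived from classifier weights alone. It assumes that each row of the final layer's weight matrix serves as one class template and that cosine similarity is the comparison measure; both choices, along with the name `class_similarity_matrix` and the toy weights, are illustrative assumptions rather than details confirmed by the abstract.

```python
import math


def class_similarity_matrix(weights):
    """Pairwise cosine similarity between per-class weight vectors.

    `weights` is a list of rows, one hypothetical "class template" per class.
    Cosine similarity is an assumption of this sketch, not a detail taken
    from the paper.
    """
    norms = [math.sqrt(sum(w * w for w in row)) for row in weights]
    n = len(weights)
    sim = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            dot = sum(a * b for a, b in zip(weights[i], weights[j]))
            sim[i][j] = dot / (norms[i] * norms[j])
    return sim


# Toy stand-in for a 3-class classifier layer with 3 input features.
toy_weights = [
    [1.0, 0.0, 0.0],  # class 0
    [0.9, 0.1, 0.0],  # class 1, nearly parallel to class 0
    [0.0, 0.0, 1.0],  # class 2, orthogonal to class 0
]

sim = class_similarity_matrix(toy_weights)
for row in sim:
    print([round(value, 3) for value in row])
```

No input images are involved at any point, which is what makes this kind of analysis applicable to an already trained network.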

Paper Structure

This paper contains 16 sections, 2 equations, 8 figures, 4 tables.

Figures (8)

  • Figure 1: Overview of the main pipeline of the proposed framework. Class Similarity Matrices are transformed into undirected similarity graphs. Each graph is then partitioned to obtain subgraphs with strong inter-node relations, and a Maximum Spanning Tree is determined for every subgraph. These structures can be used to compute our Semantic Depth or for visualization explanations.
  • Figure 2: Overview of the proposed quantitative and qualitative methods.
  • Figure 3: Different neighborhood types used in the graphs' compliance metrics.
  • Figure 4: Similarity Graph Compliance (SGC) as a function of SD: source - NFSG (functional), target - NSSG (structural). Mistakes we can explain with similarities.
  • Figure 5: Similarity Graph Compliance (SGC) as a function of SD: source - NSSG (structural), target - NFSG (functional). Perceived similarities causing confusions.
  • ...and 3 more figures
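
The Maximum Spanning Tree step named in the Figure 1 caption can be sketched as follows. This is a simplified illustration under stated assumptions: the similarity matrix is treated as a complete undirected weighted graph, the partitioning step is skipped, and the tree is built with Prim's algorithm adapted to pick the heaviest edge. The function name `maximum_spanning_tree` and the toy matrix are hypothetical, and the sketch does not compute the paper's depth metric.

```python
def maximum_spanning_tree(sim):
    """Return the edges (u, v, weight) of a maximum spanning tree.

    `sim` is a symmetric similarity matrix read as a complete undirected
    graph. Prim's algorithm is used here, always adding the most similar
    edge that connects a new node to the tree.
    """
    n = len(sim)
    in_tree = {0}  # grow the tree from node 0
    edges = []
    while len(in_tree) < n:
        best = None
        for u in in_tree:
            for v in range(n):
                if v in in_tree:
                    continue
                if best is None or sim[u][v] > best[2]:
                    best = (u, v, sim[u][v])
        edges.append(best)
        in_tree.add(best[1])
    return edges


# Toy similarity matrix: classes 0 and 1 are close, as are classes 2 and 3.
toy_sim = [
    [1.0, 0.9, 0.2, 0.1],
    [0.9, 1.0, 0.3, 0.2],
    [0.2, 0.3, 1.0, 0.8],
    [0.1, 0.2, 0.8, 1.0],
]

tree = maximum_spanning_tree(toy_sim)
for u, v, weight in tree:
    print(f"{u} -- {v} (similarity {weight})")
```

The resulting tree keeps only the strongest link needed to connect each class, which is one way a dense similarity matrix can be reduced to a structure that is easy to inspect or draw.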