Make it SING: Analyzing Semantic Invariants in Classifiers

Harel Yadid; Meir Yossef Levi; Roy Betser; Guy Gilboa

Abstract

All classifiers, including state-of-the-art vision models, possess invariants, partially rooted in the geometry of their linear mappings. These invariants, which reside in the null-space of the classifier, induce equivalent sets of inputs that map to identical outputs. The semantic content of these invariants remains vague, as existing approaches struggle to provide human-interpretable information. To address this gap, we present Semantic Interpretation of the Null-space Geometry (SING), a method that constructs equivalent images, with respect to the network, and assigns semantic interpretations to the available variations. We use a mapping from network features to multi-modal vision language models. This allows us to obtain natural language descriptions and visual examples of the induced semantic shifts. SING can be applied to a single image, uncovering local invariants, or to sets of images, allowing a breadth of statistical analysis at the class and model levels. For example, our method reveals that ResNet50 leaks relevant semantic attributes to the null space, whereas DinoViT, a ViT pretrained with self-supervised DINO, is superior in maintaining class semantics across the invariant space.
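The claim that null-space variations leave the output unchanged can be checked numerically on a toy linear head. The following is a minimal sketch, not the authors' implementation: the sizes, the random weights `W` and bias `b`, and all variable names are assumptions chosen for illustration. It uses an SVD of the head to build principal and null projectors, removes the null component of a feature, and compares the logits.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: 5 classes, 16-dimensional penultimate features.
num_classes, feat_dim = 5, 16
W = rng.standard_normal((num_classes, feat_dim))  # toy linear head weights
b = rng.standard_normal(num_classes)              # toy linear head bias

# Full SVD of the head: the rows of Vt form an orthonormal basis of feature space.
U, S, Vt = np.linalg.svd(W, full_matrices=True)
rank = int(np.sum(S > 1e-10 * S.max()))

V_principal = Vt[:rank].T  # directions that influence the logits
V_null = Vt[rank:].T       # directions the head ignores (null space of W)

P_principal = V_principal @ V_principal.T  # projector onto the principal subspace
P_null = V_null @ V_null.T                 # projector onto the null space

# A feature and its "equivalent" counterpart with the null component removed.
f = rng.standard_normal(feat_dim)
f_equiv = P_principal @ f

logits = W @ f + b
logits_equiv = W @ f_equiv + b

print("feature distance:", np.linalg.norm(f - f_equiv))
print("max logit difference:", np.abs(logits - logits_equiv).max())
```

In this toy setting the two features differ noticeably in feature space while the logits agree up to floating-point error, which is the sense in which the pair is equivalent with respect to the head.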

Paper Structure (31 sections, 17 equations, 16 figures, 3 tables)

Introduction
Related Work
Explainability through decomposition
Projecting features to a vision-language space
Method
Setup
SVD on the classifier head
Training a translator
Metrics
Attribute score.
Image score.
Applications
Model-level comparison.
Class and Attribute analysis.
Single image analysis.
...and 16 more sections

Figures (16)

Figure 1: Visualization of benign and problematic invariants. The four images at the center correspond to certain features taken from a pretrained ResNet50. On the left and right columns their equivalent images are shown, following null-space removal. Each pair yields the same logits after passing through the linear head. The left side (green) demonstrates robustness, with little semantic change. The right side (red) incurs large semantic deviations. Our framework quantifies these changes statistically, diagnosing semantic invariants at the class and network level.
Figure 2: Method Overview. The approach consists of: (a) decomposing the final linear weights to obtain principal and null projectors; (b) training a translator that maps features from the network embedding space to the CLIP image space; (c) creating an equivalent pair for the feature we want to examine; and (d) translating the pair into the CLIP image embedding space and applying our metrics and visualizations.
Figure 3: Model-level comparison (1,000 classes). (a) Attribute Score (AS) quantifies class-dependent semantic leakage into the null space; Image Score (IS) quantifies tolerance to class-independent semantic variation within the invariant subspace. Desirably, AS is low and IS is high (relative to AS). In our results, DinoViT performs best in this regard. (b) We summarize the trade-off with the $\mathrm{IS}/\mathrm{AS}$ ratio (higher is better): DinoViT has the highest ratio and ResNext101 the lowest.
Figure 4: Class Comparison. DinoViT consistently preserves low semantic leakage across classes, whereas ResNet50 exhibits a pronounced imbalance, with certain classes, such as Porcupine and Sports-Car, leaking substantially more semantic information into the null space.
Figure 5: Open-vocabulary concept analysis. For DinoViT, we sample $\sim 1300$ images per class and compute the CLIP angle (degrees; lower is more similar) to a set of concepts for (a) the "Arabian Camel" class and (b) the "Jellyfish" class. Blue dots denote original features; red dots denote null-removed (equivalent) features. Green arrows connect each pair and represent the Attribute Score after null removal. Longer arrows indicate larger $|\mathrm{AS}|$ (greater class-dependent semantic leakage); shorter arrows indicate minimal leakage.
...and 11 more figures
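
The Figure 5 caption reports angles in degrees between CLIP embeddings and concept embeddings, before and after null removal. The following is a minimal sketch of that kind of measurement on toy vectors. The exact definition of the Attribute Score is not given in this section, so the signed angle difference computed here is an assumption used only to illustrate the arrows in the plot; the helper `angle_deg` and the three toy vectors are hypothetical.

```python
import math

def angle_deg(u, v):
    """Angle in degrees between two vectors (lower means more similar)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    cos = max(-1.0, min(1.0, dot / (norm_u * norm_v)))  # guard against rounding
    return math.degrees(math.acos(cos))

# Toy stand-ins for embeddings in a shared vision-language space.
concept = [1.0, 0.0, 0.0]       # hypothetical concept (text) embedding
original = [1.0, 1.0, 0.0]      # hypothetical translated original feature
null_removed = [1.0, 2.0, 0.0]  # hypothetical translated null-removed feature

angle_before = angle_deg(original, concept)
angle_after = angle_deg(null_removed, concept)

# Assumed proxy for the arrow length: change in angle caused by null removal.
shift = angle_after - angle_before
print(angle_before, angle_after, shift)
```

Under this assumed proxy, a shift near zero would correspond to a short arrow (little change in similarity to the concept), and a large absolute shift to a long arrow.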
