Interpreting CLIP: Insights on the Robustness to ImageNet Distribution Shifts

Jonathan Crabbé; Pau Rodríguez; Vaishaal Shankar; Luca Zappella; Arno Blaas

TL;DR

We detect outlier features in robust zero-shot CLIP vision encoders and investigate the number of unique concepts encoded in the representation space, finding that zero-shot CLIP models encode a higher number of unique concepts.

Abstract

What distinguishes robust models from non-robust ones? While for ImageNet distribution shifts it has been shown that such differences in robustness can be traced back predominantly to differences in training data, so far it is not known what that translates to in terms of what the model has learned. In this work, we bridge this gap by probing the representation spaces of 16 robust zero-shot CLIP vision encoders with various backbones (ResNets and ViTs) and pretraining sets (OpenAI, LAION-400M, LAION-2B, YFCC15M, CC12M and DataComp), and comparing them to the representation spaces of less robust models with identical backbones, but different (pre)training sets or objectives (CLIP pretraining on ImageNet-Captions, and supervised training or finetuning on ImageNet). Through this analysis, we generate three novel insights. Firstly, we detect the presence of outlier features in robust zero-shot CLIP vision encoders, which to the best of our knowledge is the first time these are observed in non-language and non-transformer models. Secondly, we find the existence of outlier features to be an indication of ImageNet shift robustness in models, since we only find them in robust models in our analysis. Lastly, we also investigate the number of unique encoded concepts in the representation space and find zero-shot CLIP models to encode a higher number of unique concepts in their representation space. However, we do not find this to be an indicator of ImageNet shift robustness and hypothesize that it is rather related to the language supervision. Since the presence of outlier features can be detected without access to any data from shifted datasets, we believe that they could be a useful tool for practitioners to get a feeling for the distribution shift robustness of a pretrained model during deployment.
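
The caption of Figure 2 associates outlier features with high activation kurtosis. The paper's exact definition is not reproduced on this page, so the following is only a minimal sketch under stated assumptions: it uses the standard Pearson kurtosis (fourth standardized moment), computed across the feature dimension of each activation vector and averaged over samples. The function name `activation_kurtosis` and the synthetic activations are hypothetical and purely illustrative.

```python
import numpy as np


def activation_kurtosis(activations: np.ndarray) -> float:
    """Mean Pearson kurtosis of activation vectors.

    Assumption: `activations` has shape (n_samples, n_features), and the
    kurtosis is taken across features for each sample, then averaged.
    This may differ from the definition used in the paper.
    """
    centered = activations - activations.mean(axis=1, keepdims=True)
    second_moment = (centered ** 2).mean(axis=1)
    fourth_moment = (centered ** 4).mean(axis=1)
    return float((fourth_moment / second_moment ** 2).mean())


# Synthetic stand-ins for last-layer activations (not real model outputs).
rng = np.random.default_rng(0)
plain = rng.normal(size=(256, 512))   # roughly Gaussian features
with_outlier = plain.copy()
with_outlier[:, 7] *= 50.0            # one feature with a much larger scale

kurtosis_plain = activation_kurtosis(plain)
kurtosis_outlier = activation_kurtosis(with_outlier)
print(f"no outlier feature:  {kurtosis_plain:.2f}")
print(f"one outlier feature: {kurtosis_outlier:.2f}")
```

For Gaussian-like activations this statistic stays close to 3, while a single dominant feature pushes it far higher. Note that the sketch needs only activations on whatever inputs are at hand, in line with the abstract's remark that no data from shifted datasets is required.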

Paper Structure (41 sections, 9 equations, 22 figures, 7 tables)

Introduction
Background on CLIP models and notation
CLIP architecture.
Building zero-shot classifiers.
Robustness of CLIP models
Measuring robustness.
Model pool.
Results.
Outlier features and privileged directions
Approach.
Preliminary observations.
Activation kurtosis.
Privileged directions in representation space.
Emergence of outlier features.
Relationship of outlier features to pruning.
...and 26 more sections

Figures (22)

Figure 1: Robustness metrics for the models in our pool. Higher values indicate higher robustness. We see that for each backbone and pretraining data, robustness decreases from Robust zero-shot CLIP to Fine-tuned CLIP and reaches a minimum for ImageNet supervised ('Supervised') and ImageNet-Captions trained ('Non-robust zero-shot CLIP') models.
Figure 2: Outlier features and privileged directions. Most robust zero-shot CLIP models have outlier features (high kurtosis) and privileged directions (high direction importance outlierness). Fine-tuned models have no outlier features but still exhibit privileged directions, although those are noticeably less privileged. Supervised models and non-robust zero-shot CLIP models have no outlier features. The full distribution of importance scores can be found in \ref{subsec:APP_privdirection}.
Figure 3: Results of the unique concept analysis, showing total number of unique Broden concepts encoded in last layers, as in \ref{equ:NUniqueConcept}. Zero-shot CLIP models encode substantially more concepts than supervised models.
Figure 4: Overlap between the concepts encoded in the representation space of different models, for each of the OpenAI models. Zero-shot models encode many concepts not encoded by other models.
Figure 5: Overlap of the finetuned model concepts with zero-shot and supervised models during finetuning, normalized at each epoch by the number of finetuned concepts $|\mathcal{C}_{\mathrm{fine}}|$. The overlap with the zero-shot-only concepts (i.e. those not overlapping with supervised) decreases (blue curve), while the overlap with concepts shared by zero-shot and supervised models increases (green curve).
...and 17 more figures
