Discriminating image representations with principal distortions

Jenelle Feather; David Lipshutz; Sarah E. Harvey; Alex H. Williams; Eero P. Simoncelli

TL;DR

The paper addresses the problem of distinguishing image representations when global geometry is similar but local geometry differs by proposing a Fisher information–based framework to quantify local sensitivities around a base image. It introduces a metric on local geometry, $m_{\mathbf{u},\mathbf{v}}(\mathbf{I}_A,\mathbf{I}_B)$, and defines "principal distortions" that maximize cross-model variance across $N$ representations, extending beyond pairwise comparisons. The approach is validated on hand-crafted early-visual models and multiple deep neural networks, revealing architecture- and training-dependent differences in local sensitivity and showing efficient model differentiation with a small set of distortions. These principal distortions offer a tool for probing how model local geometries align with human perception and could inform interpretability, psychophysics experiments, and the analysis of robustness and texture-shape biases.

Abstract

Image representations (artificial or biological) are often compared in terms of their global geometric structure; however, representations with similar global structure can have strikingly different local geometries. Here, we propose a framework for comparing a set of image representations in terms of their local geometries. We quantify the local geometry of a representation using the Fisher information matrix, a standard statistical tool for characterizing the sensitivity to local stimulus distortions, and use this as a substrate for a metric on the local geometry in the vicinity of a base image. This metric may then be used to optimally differentiate a set of models, by finding a pair of "principal distortions" that maximize the variance of the models under this metric. As an example, we use this framework to compare a set of simple models of the early visual system, identifying a novel set of image distortions that allow immediate comparison of the models by visual inspection. In a second example, we apply our method to a set of deep neural network models and reveal differences in the local geometry that arise due to architecture and training types. These examples demonstrate how our framework can be used to probe for informative differences in local sensitivities between complex models, and suggest how it could be used to compare model representations with human perception.
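The pipeline described in the abstract can be illustrated with a small numerical sketch. Everything in it is an assumption made for illustration, not the paper's implementation: each model is taken to be a deterministic map with additive unit-variance Gaussian noise (so its Fisher information matrix is $J^\top J$ for Jacobian $J$), the quantity compared across models is assumed to be the log ratio of sensitivities along two distortions $\mathbf{u}$ and $\mathbf{v}$, and the search over distortion pairs is a crude random search standing in for whatever optimizer the authors use. The function names are hypothetical.

```python
import numpy as np


def fisher_information(jacobian):
    """FIM at the base image, ASSUMING y = f(x) + unit-variance Gaussian noise."""
    return jacobian.T @ jacobian


def log_sensitivity_ratio(fim, u, v):
    """log[ sqrt(u^T I u) / sqrt(v^T I v) ]: an ASSUMED form of the per-model score."""
    return 0.5 * (np.log(u @ fim @ u) - np.log(v @ fim @ v))


def cross_model_variance(fims, u, v):
    """Variance across models of the log sensitivity ratio for the pair (u, v)."""
    return np.var([log_sensitivity_ratio(f, u, v) for f in fims])


def random_search_distortions(fims, n_candidates=200, seed=0):
    """Pick the candidate pair with the largest cross-model variance.

    This is a brute-force random search, not the paper's optimization method.
    """
    rng = np.random.default_rng(seed)
    dim = fims[0].shape[0]
    # Random unit-norm candidate distortions, one per row.
    cands = rng.standard_normal((n_candidates, dim))
    cands /= np.linalg.norm(cands, axis=1, keepdims=True)
    # Quadratic forms e^T I e for every model and candidate: (n_models, n_candidates).
    quad = np.stack([np.einsum("ij,jk,ik->i", cands, f, cands) for f in fims])
    log_s = 0.5 * np.log(quad)
    # Log ratio for every ordered pair of candidates, then variance over models.
    diff = log_s[:, :, None] - log_s[:, None, :]
    var = diff.var(axis=0)
    a, b = np.unravel_index(np.argmax(var), var.shape)
    return cands[a], cands[b], var[a, b]


# Toy stand-ins for models: three linear maps on a 2-pixel "image".
toy_jacobians = [np.diag([1.0, 1.0]), np.diag([3.0, 1.0]), np.diag([1.0, 3.0])]
toy_fims = [fisher_information(j) for j in toy_jacobians]
u_best, v_best, best_var = random_search_distortions(toy_fims)
print("u:", u_best, "v:", v_best, "variance:", best_var)
```

In this toy setting the first model is equally sensitive in every direction, so any variance across models comes from the two anisotropic models disagreeing about which of the two distortions is more detectable. For real networks the Jacobians would be far too large to form explicitly, so a practical version would need matrix-free Jacobian-vector products; that is outside the scope of this sketch.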
