Inverting Visual Representations with Convolutional Networks

Alexey Dosovitskiy; Thomas Brox

TL;DR

The paper introduces an up-convolutional network framework to invert image representations, enabling reconstruction from both traditional descriptors (HOG, SIFT, LBP) and deep CNN features (AlexNet). By learning the conditional expectation of images given feature vectors, the approach reveals what information is preserved and what is discarded by different representations, showing that colors and rough object layout can be recovered even from high-level activations and class probabilities. It also demonstrates that high-level reconstructions rely largely on activation patterns rather than precise magnitudes, and that the model learns a natural image prior that supports plausible colorization and structure from random features. The method is fast at test time and broadly applicable to arbitrary feature representations, offering new insights into the structure of visual representations and the role of invariances in CNNs.
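The phrase "learning the conditional expectation of images given feature vectors" can be illustrated with a toy example. The sketch below is not the paper's up-convolutional network: it replaces the network with a simple per-feature average, and the 4-pixel "images" and the 3-level brightness feature are hypothetical stand-ins. It only shows that, under a squared-error reconstruction loss, the best estimate for a given feature value is the mean of all images that share that feature.

```python
import numpy as np

def mse_optimal_inverse(features, images):
    """For each distinct feature value, return the estimate that minimizes
    mean squared error: the mean of all images sharing that feature."""
    return {int(k): images[features == k].mean(axis=0)
            for k in np.unique(features)}

rng = np.random.default_rng(0)
images = rng.random((1000, 4))  # toy "images" with 4 pixels each (assumption)

# Hypothetical lossy feature: mean brightness quantized into 3 levels.
features = np.digitize(images.mean(axis=1), [0.4, 0.6])

inverse = mse_optimal_inverse(features, images)
recon = np.stack([inverse[int(k)] for k in features])
mse_cond_mean = float(np.mean((images - recon) ** 2))

# Any other estimate per feature value does worse, e.g. a shifted one.
mse_shifted = float(np.mean((images - (recon + 0.05)) ** 2))
```

Because many different images map to the same feature, the squared-error minimizer averages over them, which is one way to see why reconstructions from highly invariant features tend to look blurry.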

Abstract

Feature representations, both hand-designed and learned ones, are often hard to analyze and interpret, even when they are extracted from visual data. We propose a new approach to study image representations by inverting them with an up-convolutional neural network. We apply the method to shallow representations (HOG, SIFT, LBP), as well as to deep networks. For shallow representations our approach provides significantly better reconstructions than existing methods, revealing that there is surprisingly rich information contained in these features. Inverting a deep network trained on ImageNet provides several insights into the properties of the feature representation learned by the network. Most strikingly, the colors and the rough contours of an image can be reconstructed from activations in higher network layers and even from the predicted class probabilities.
