Recognizing Image Style

Sergey Karayev, Matthew Trentacoste, Helen Han, Aseem Agarwala, Trevor Darrell, Aaron Hertzmann, Holger Winnemoeller

TL;DR

Karayev et al. address visual style recognition by building two large-scale labeled datasets (Flickr Style and Wikipaintings) and evaluating multiple image features for style classification. They find that deep CNN features learned from object-class data transfer effectively to style tasks, achieving state-of-the-art results on Flickr Style, Wikipaintings, and AVA Style, often approaching human performance. The work demonstrates practical applications in style-constrained image search and argues that style is content-dependent, offering insights into why object-recognition features transfer to style. Data, predictors, and code are released to enable broader research.

Abstract

The style of an image plays a significant role in how it is viewed, but style has received little attention in computer vision research. We describe an approach to predicting the style of images, and perform a thorough evaluation of different image features for this task. We find that features learned in a multi-layer network generally perform best -- even when trained with object class (not style) labels. Our large-scale learning methods result in the best published performance on an existing dataset of aesthetic ratings and photographic style annotations. We present two novel datasets: 80K Flickr photographs annotated with 20 curated style labels, and 85K paintings annotated with 25 style/genre labels. Our approach shows excellent classification performance on both datasets. We use the learned classifiers to extend traditional tag-based image search to consider stylistic constraints, and demonstrate cross-dataset understanding of style.

Paper Structure

This paper contains 20 sections, 1 equation, 9 figures, 7 tables.

Figures (9)

  • Figure 1: Typical images in different style categories of our datasets.
  • Figure 2: Correlation of PASCAL content classifier predictions (rows) against ground truth Flickr Style labels (columns). We see, for instance, that the Macro style is highly correlated with presence of animals, and that Long Exposure and Sunny style photographs often feature vehicles.
  • Figure 3: Top five most-confident positive predictions on the Flickr Style test set, for a few different styles.
  • Figure 4: Cross-dataset style. On the left are shown top scorers from the Wikipaintings set, for styles learned on the Flickr set. On the right, Flickr photographs are accordingly sorted by Painting style. (Figure best viewed in color.)
  • Figure 5: Example of filtering image search results by style. Our Flickr Style classifiers are applied to images found on Pinterest. The images are searched by the text contents of their captions, then filtered by the response of the style classifiers. Here we show three out of top five results for different query/style combinations.
  • ...and 4 more figures