Recognizing Image Style
Sergey Karayev, Matthew Trentacoste, Helen Han, Aseem Agarwala, Trevor Darrell, Aaron Hertzmann, Holger Winnemoeller
TL;DR
Karayev et al. address visual style recognition by building two large-scale labeled datasets (Flickr Style and Wikipaintings) and evaluating multiple image features for style classification. They find that deep CNN features learned from object-class data transfer effectively to style tasks, achieving state-of-the-art results on Flickr Style, Wikipaintings, and AVA Style, often approaching human performance. The work demonstrates practical applications in style-constrained image search and argues that style is content-dependent, offering insights into why object-recognition features transfer to style. Data, predictors, and code are released to enable broader research.
Abstract
The style of an image plays a significant role in how it is viewed, but style has received little attention in computer vision research. We describe an approach to predicting the style of images, and perform a thorough evaluation of different image features for this task. We find that features learned in a multi-layer network generally perform best, even when trained with object class (not style) labels. Our large-scale learning methods result in the best published performance on an existing dataset of aesthetic ratings and photographic style annotations. We present two novel datasets: 80K Flickr photographs annotated with 20 curated style labels, and 85K paintings annotated with 25 style/genre labels. Our approach shows excellent classification performance on both datasets. We use the learned classifiers to extend traditional tag-based image search to consider stylistic constraints, and demonstrate cross-dataset understanding of style.
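The style-constrained search mentioned in the abstract can be illustrated with a minimal sketch. It assumes each photo already carries per-style confidence scores from some trained classifier. The `Photo` record, the `style_search` function, the style names, and all scores are hypothetical and chosen only for illustration; this is not the authors' released code.

```python
from dataclasses import dataclass


@dataclass
class Photo:
    photo_id: str
    tags: set
    # Hypothetical: style label -> classifier confidence in [0, 1].
    style_scores: dict


def style_search(photos, query_tag, style, top_k=3):
    """Keep photos that carry the query tag, then rank them by style score."""
    matches = [p for p in photos if query_tag in p.tags]
    # Photos with no score for the requested style sort last (score 0.0).
    matches.sort(key=lambda p: p.style_scores.get(style, 0.0), reverse=True)
    return matches[:top_k]


# Toy data; the style names and scores are made up for this example.
photos = [
    Photo("a", {"beach", "sunset"}, {"Noir": 0.10, "Pastel": 0.80}),
    Photo("b", {"beach"}, {"Noir": 0.70, "Pastel": 0.20}),
    Photo("c", {"city"}, {"Noir": 0.90, "Pastel": 0.05}),
    Photo("d", {"beach", "night"}, {"Noir": 0.40, "Pastel": 0.30}),
]

results = [p.photo_id for p in style_search(photos, "beach", "Noir")]
print(results)
```

In this sketch the tag acts as a hard filter and the style score only reorders the matches, so photo "c" is excluded despite its high style score because it lacks the "beach" tag. A real system might instead blend tag relevance and style confidence into one ranking score.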
