Do Saliency Models Detect Odd-One-Out Targets? New Datasets and Evaluations

Iuliia Kotseruba, Calden Wloka, Amir Rasouli, John K. Tsotsos

TL;DR

This work investigates singleton detection, which can be thought of as a canonical example of salience, and demonstrates that nearly all saliency algorithms fail to respond adequately to singleton targets in synthetic and natural images.

Abstract

Recent advances in the field of saliency have concentrated on fixation prediction, with benchmarks reaching saturation. However, there is an extensive body of work in psychology and neuroscience that describes aspects of human visual attention that might not be adequately captured by current approaches. Here, we investigate singleton detection, which can be thought of as a canonical example of salience. We introduce two novel datasets, one with psychophysical patterns and one with natural odd-one-out stimuli. Using these datasets, we demonstrate through extensive experimentation that nearly all saliency algorithms fail to respond adequately to singleton targets in synthetic and natural images. Furthermore, we investigate the effect of training state-of-the-art CNN-based saliency models on these types of stimuli and conclude that the additional training data does not significantly improve their ability to find odd-one-out targets. Datasets are available at http://data.nvision2.eecs.yorku.ca/P3O3/.

Paper Structure

This paper contains 18 sections, 9 figures, 5 tables.

Figures (9)

  • Figure 1: Sample images from P$^3$ with (a) color, (b) orientation and (c) size singletons.
  • Figure 2: Sample images from the O$^3$ dataset with singletons in various feature dimensions. From left to right: color, size, color/texture, shape, size, orientation.
  • Figure 3: a) Number of fixations vs. % of targets detected. b) Performance on color, orientation and size singletons at a maximum of 100 fixations. Models are sorted by the % of targets detected at a maximum of 100 fixations. Labels for deep models are shown in bold.
  • Figure 4: The discriminative ability (GSI score) of the top-3 classical and deep models for a range of target-distractor (TD) differences in the color (a), orientation (b) and size (c) feature dimensions. The models are selected based on the total number of targets found within 100 fixations in each dimension. d) Sample saliency maps for each model. The TD difference for color and orientation targets increases from the top row to the bottom row. Size targets range from smallest to largest.
  • Figure 5: a) The mean MSR$_{targ}$ of the top-3 classical and deep models for the color and non-color targets in O$^3$ (shown as blue and red dots respectively). Along the dashed line the performance of the models is equal. The red, yellow and green colors show areas where targets are not discriminated (MSR$_{targ}$ $< 1$), somewhat discriminated ($1 \leq$ MSR$_{targ}$ $\leq 2$) and strongly discriminated (MSR$_{targ}$ $> 2$). b) Sample images and corresponding saliency maps. From top to bottom row: hard for both, easy for both, classical models perform better, deep models perform better.
  • ...and 4 more figures