RClicks: Realistic Click Simulation for Benchmarking Interactive Segmentation

Anton Antonov, Andrey Moskalenko, Denis Shepelev, Alexander Krapukhin, Konstantin Soshin, Anton Konushin, Vlad Shakhuro

TL;DR

A clickability model is developed for sampling clicks that closely resemble actual user inputs. The authors consider this work a significant step towards interactive segmentation methods that provide the best user experience in real-world cases.

Abstract

The emergence of Segment Anything (SAM) sparked research interest in the field of interactive segmentation, especially in the context of image editing tasks and speeding up data annotation. Unlike common semantic segmentation, interactive segmentation methods allow users to directly influence their output through prompts (e.g., clicks). However, click patterns in real-world interactive segmentation scenarios remain largely unexplored. Most methods rely on the assumption that users would click in the center of the largest erroneous area. Nevertheless, recent studies show that this is not always the case. Thus, methods may perform poorly in real-world deployment despite high metrics in a baseline benchmark. To accurately simulate real-user clicks, we conducted a large crowdsourcing study of click patterns in an interactive segmentation scenario and collected 475K real-user clicks. Drawing on ideas from saliency tasks, we develop a clickability model that enables sampling clicks that closely resemble actual user inputs. Using our model and dataset, we propose the RClicks benchmark for a comprehensive comparison of existing interactive segmentation methods on realistic clicks. Specifically, we evaluate not only the average quality of methods, but also their robustness w.r.t. click patterns. According to our benchmark, in real-world usage interactive segmentation models may perform worse than has been reported in the baseline benchmark, and most of the methods are not robust. We believe that RClicks is a significant step towards creating interactive segmentation methods that provide the best user experience in real-world cases.
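
The baseline assumption mentioned above (a click in the center of the largest erroneous area) can be illustrated with a minimal sketch. This is not the authors' code: the function names `largest_error_region` and `baseline_click` are hypothetical, masks are assumed to be small nested lists of 0/1, "center" is assumed to mean the pixel farthest from the region's boundary, and the brute-force distance search is only practical for toy inputs.

```python
from collections import deque


def largest_error_region(pred, gt):
    """Return pixels (row, col) of the largest 4-connected region where pred != gt.

    False positives and false negatives are treated as separate regions (assumption).
    """
    h, w = len(gt), len(gt[0])
    seen = [[False] * w for _ in range(h)]
    best = []
    for r in range(h):
        for c in range(w):
            if seen[r][c] or pred[r][c] == gt[r][c]:
                continue
            kind = gt[r][c]  # 1 -> missed object pixel, 0 -> spurious pixel
            region, queue = [], deque([(r, c)])
            seen[r][c] = True
            while queue:  # breadth-first search over one error region
                y, x = queue.popleft()
                region.append((y, x))
                for ny, nx in ((y + 1, x), (y - 1, x), (y, x + 1), (y, x - 1)):
                    if (0 <= ny < h and 0 <= nx < w and not seen[ny][nx]
                            and pred[ny][nx] != gt[ny][nx] and gt[ny][nx] == kind):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            if len(region) > len(best):
                best = region
    return best


def baseline_click(pred, gt):
    """Return ((row, col), is_positive) for the simulated click, or None if pred == gt."""
    region = largest_error_region(pred, gt)
    if not region:
        return None
    inside = set(region)
    h, w = len(gt), len(gt[0])
    # Everything not in the region counts as "outside", including a one-pixel
    # ring around the image so regions touching the border still have a boundary.
    outside = [(y, x) for y in range(-1, h + 1) for x in range(-1, w + 1)
               if (y, x) not in inside]

    def dist2(p):
        # Squared Euclidean distance to the nearest outside pixel.
        return min((p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2 for q in outside)

    y, x = max(region, key=dist2)
    return (y, x), gt[y][x] == 1
```

A positive click (`is_positive` is true) marks a missed part of the object, and a negative click marks a wrongly included area. The strategy is deterministic, so it yields a single click per state, whereas real users produce a distribution of clicks.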

Paper Structure

This paper contains 38 sections, 15 figures, 12 tables.

Figures (15)

  • Figure 1: Examples of real and predicted user clicks in an interactive segmentation task. The upper row depicts real-user clicks (green) for a given target object (white contour); the middle and bottom rows visualize, respectively, clicks and their distribution predicted by our clickability model. Purple points in the middle and bottom rows represent clicks generated by the baseline strategy \cite{ritm}. In most cases the baseline click is close to a mode of the users' distribution (see \ref{fig:teaser:berkeley} and \ref{fig:teaser:tetris}); however, in some cases it may be far from the mode (e.g. \ref{fig:teaser:grabcut}, \ref{fig:teaser:davis}) or may not represent all modes of the distribution (e.g. \ref{fig:teaser:coco}, \ref{fig:teaser:tetris}).
  • Figure 2: Illustration of the tested display modes to reduce presentation bias. The best result was obtained with the Object CutOut mode, where an object is presented on a gray background without shifts.
  • Figure 3: Examples of situations where instructing participants with text descriptions may be challenging or ambiguous: selection of a certain instance in the first round (\ref{fig:hard_clicks_descriptions:1}-\ref{fig:hard_clicks_descriptions:2}); and selecting or unselecting a certain error area in the subsequent round (\ref{fig:hard_clicks_descriptions:fp}-\ref{fig:hard_clicks_descriptions:fn}).
  • Figure 4: Examples of considered clickability models: \ref{fig:click_models:fixations} visualizes the target object (white contour) and ground-truth clicks (green points); \ref{fig:click_models:uniform} -- \ref{fig:click_models:saliency} depict uniform distribution (UD), distance transform (DT), and saliency map (SM) respectively; \ref{fig:click_models:ours} -- our predicted clickability map.
  • Figure 5: Proposed clickability prediction pipeline.
  • ...and 10 more figures
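
Two of the simple clickability models named in the Figure 4 caption, uniform distribution (UD) and distance transform (DT), can be sketched as probability maps from which clicks are sampled. This is an illustrative reading of the caption, not the paper's implementation: the function names are hypothetical, the mask is assumed to be a non-empty nested list of 0/1, and the DT map is assumed to weight each pixel by its Euclidean distance to the nearest non-mask pixel.

```python
import random


def uniform_map(mask):
    """UD: every pixel of the target mask is equally likely to be clicked."""
    n = sum(sum(row) for row in mask)
    return [[v / n for v in row] for row in mask]


def distance_transform_map(mask):
    """DT: click probability proportional to the distance to the nearest non-mask pixel."""
    h, w = len(mask), len(mask[0])
    # Non-mask pixels, plus a one-pixel ring outside the image border.
    outside = [(y, x) for y in range(-1, h + 1) for x in range(-1, w + 1)
               if not (0 <= y < h and 0 <= x < w and mask[y][x])]
    dist = [[min(((y - qy) ** 2 + (x - qx) ** 2) ** 0.5 for qy, qx in outside)
             if mask[y][x] else 0.0
             for x in range(w)] for y in range(h)]
    total = sum(map(sum, dist))
    return [[d / total for d in row] for row in dist]


def sample_clicks(prob_map, n, rng):
    """Draw n (row, col) clicks with replacement according to prob_map."""
    pixels = [(y, x) for y, row in enumerate(prob_map)
              for x, p in enumerate(row) if p > 0]
    weights = [prob_map[y][x] for y, x in pixels]
    return rng.choices(pixels, weights=weights, k=n)
```

For example, `sample_clicks(distance_transform_map(mask), 100, random.Random(0))` draws 100 clicks concentrated near the object's interior, while the UD map spreads them evenly over the object. Neither map uses image content, which is the gap a learned clickability model is meant to close.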