MFP: Making Full Use of Probability Maps for Interactive Image Segmentation

Chaewon Lee, Seon-Ho Lee, Chang-Su Kim

TL;DR

The paper tackles the inefficiency in propagating information from previous probability maps in click-based interactive segmentation. It introduces MFP, which modulates the prior probability map $P^{t-1}$ to $\tilde{P}^{t-1}$ via gamma correction within a modulation window around the current click, and feeds this modulated map as extra input, with a late-fusion architecture that combines probability-related features ${\cal F}_P^{t}$ with backbone features ${\cal F}_B^{t}$. It defines distance-based gamma schemes using Euclidean distance $d=\|x-u\|$ or probability distance $d=(P^{t-1}_x-P^{t-1}_u)^2$, and employs a recursive training regime over sequences of up to 24 clicks. Across ResNet-34, HRNet-18, and ViT-B backbones, MFP achieves superior NoC and IoU/AUC on GrabCut, Berkeley, DAVIS, and SBD (with COCO+LVIS training), demonstrating strong generalization and practical impact; code is publicly available for replication.
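The modulation step summarized above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function name `modulate_probability`, the parameters `radius` and `gamma_max`, the circular window, and the linear mapping from distance to gamma are all assumptions made for illustration. Only the general idea comes from the summary: probabilities near a foreground click are raised to the power $1/\gamma$, those near a background click to the power $\gamma$, and $\gamma$ depends on either the Euclidean distance or the probability distance to the clicked pixel.

```python
import numpy as np


def modulate_probability(prev_prob, click_yx, is_positive,
                         radius=20, gamma_max=4.0, scheme="euclidean"):
    """Gamma-correct a previous probability map around one click.

    Hypothetical sketch: `radius`, `gamma_max`, the circular window, and the
    linear distance-to-gamma mapping are illustrative assumptions.
    """
    prob = np.clip(np.asarray(prev_prob, dtype=np.float64), 0.0, 1.0)
    h, w = prob.shape
    cy, cx = click_yx
    ys, xs = np.mgrid[0:h, 0:w]

    # Euclidean distance of every pixel x to the click position u.
    spatial = np.sqrt((ys - cy) ** 2 + (xs - cx) ** 2)
    in_window = spatial <= radius  # modulation window (assumed circular)

    if scheme == "euclidean":
        # d = ||x - u||, normalized by the window radius.
        closeness = 1.0 - spatial / radius
    elif scheme == "probability":
        # d = (P_x - P_u)^2, already in [0, 1].
        closeness = 1.0 - (prob - prob[cy, cx]) ** 2
    else:
        raise ValueError(f"unknown scheme: {scheme!r}")
    closeness = np.clip(closeness, 0.0, 1.0)

    # gamma equals gamma_max at the click and 1 (no change) outside the window.
    gamma = np.where(in_window, 1.0 + (gamma_max - 1.0) * closeness, 1.0)

    # Foreground click: power 1/gamma < 1 raises probabilities.
    # Background click: power gamma > 1 lowers them.
    power = 1.0 / gamma if is_positive else gamma
    return prob ** power


prev = np.full((9, 9), 0.5)
fg = modulate_probability(prev, (4, 4), True, radius=3)
bg = modulate_probability(prev, (4, 4), False, radius=3)
```

In this sketch, pixels outside the window keep their previous probability because their exponent is exactly 1, so the modulation stays local to the click.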

Abstract

In recent interactive segmentation algorithms, previous probability maps are used as network input to help predictions in the current segmentation round. However, despite the utilization of previous masks, useful information contained in the probability maps is not well propagated to the current predictions. In this paper, to overcome this limitation, we propose a novel and effective algorithm for click-based interactive image segmentation, called MFP, which attempts to make full use of probability maps. We first modulate previous probability maps to enhance their representations of user-specified objects. Then, we feed the modulated probability maps as additional input to the segmentation network. We implement the proposed MFP algorithm based on the ResNet-34, HRNet-18, and ViT-B backbones and assess the performance extensively on various datasets. It is demonstrated that MFP meaningfully outperforms the existing algorithms using identical backbones. The source codes are available at https://github.com/cwlee00/MFP.

Paper Structure (13 sections, 6 equations, 7 figures, 5 tables)

Figures (7)

  • Figure 1: Utilization of previous probability information in the current segmentation round: The conventional click-based interactive segmentation algorithm SimpleClick (Liu et al., 2023) fails to capture the details contained in the previous probability maps. On the other hand, the proposed algorithm exploits the shape details in the previous probability maps to yield a better segmentation result in the current round.
  • Figure 2: (a) Existing algorithms take an input image $I$, click map $C^{t}$, and previous probability map $P^{t-1}$ as input to the segmentation network. From these input signals, they extract feature ${\cal F}_B^{t}$ and directly feed it into the segmentation head to obtain the current probability map $P^{t}$. Then, $P^{t}$ is thresholded to the final object mask $Y^{t}$. (b) In contrast, the proposed MFP algorithm modulates $P^{t-1}$ into $\tilde{P}^{t-1}$ and takes it as additional input to the network. Furthermore, MFP late-fuses probability-related feature ${\cal F}_P^{t}$ with backbone feature ${\cal F}_B^{t}$ before the segmentation head.
  • Figure 3: Overview of the proposed MFP algorithm. In the click map, foreground and background clicks are depicted in green and red, respectively.
  • Figure 4: Illustration of the gamma correction for the probability modulation: (a) probabilities near a foreground click are increased with power $\frac{1}{\gamma}$ smaller than 1, and (b) those near a background click are decreased with power $\gamma$ bigger than 1.
  • Figure 5: Examples of the probability map modulation via gamma correction. From top to bottom: previous probability maps ${P}^{t-1}$ before modulation with current clicks, modulated probability maps $\tilde{P}^{t-1}$, and ground-truth masks of target objects.
  • ...and 2 more figures
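
The data flow described in the Figure 2 caption (late fusion of ${\cal F}_P^{t}$ with ${\cal F}_B^{t}$, a segmentation head producing $P^{t}$, and thresholding to $Y^{t}$) can be sketched in NumPy as follows. This is an illustrative sketch only. The caption does not say how the two feature maps are fused, so channel concatenation followed by a 1x1 projection is an assumption, as are the function names `late_fuse` and `segmentation_head`, the random weights, and the 0.5 threshold.

```python
import numpy as np


def late_fuse(backbone_feat, prob_feat, proj_weight):
    """Fuse F_B (C_b, H, W) and F_P (C_p, H, W) into (C_out, H, W).

    Assumption: concatenation plus a 1x1 projection; the source only states
    that the features are fused before the segmentation head.
    """
    fused = np.concatenate([backbone_feat, prob_feat], axis=0)
    # A 1x1 convolution is a per-pixel linear map over channels.
    return np.einsum("oc,chw->ohw", proj_weight, fused)


def segmentation_head(feat, head_weight, threshold=0.5):
    """Map fused features to a probability map P^t and a binary mask Y^t."""
    logits = np.einsum("c,chw->hw", head_weight, feat)
    prob = 1.0 / (1.0 + np.exp(-logits))  # sigmoid
    mask = prob > threshold               # threshold value is an assumption
    return prob, mask


rng = np.random.default_rng(0)
f_b = rng.standard_normal((8, 16, 16))     # stand-in for backbone features
f_p = rng.standard_normal((4, 16, 16))     # stand-in for probability features
w_proj = rng.standard_normal((8, 12)) * 0.1
w_head = rng.standard_normal(8) * 0.1

fused = late_fuse(f_b, f_p, w_proj)
p_t, y_t = segmentation_head(fused, w_head)
```

In a real model the stand-in arrays would come from the backbone and from a branch that processes the modulated map $\tilde{P}^{t-1}$; the random arrays here only exercise the shapes.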