Towards Object Segmentation Mask Selection Using Specular Reflections

Katja Kossira, Yunxuan Zhu, Jürgen Seiler, André Kaup

TL;DR

This work identifies the largest region containing the specular reflection as the object and thereby derives a more accurate object mask without requiring specialized training data or model adaptation. The method is compared against established and state-of-the-art techniques including Otsu thresholding, YOLO, and SAM2.

Abstract

Specular reflections pose a significant challenge for object segmentation, as their sharp intensity transitions often mislead both conventional algorithms and deep-learning-based methods. However, a specular reflection must lie on the surface of the object, and this fact can be exploited to improve the segmentation masks. By identifying the largest region containing the reflection as the object, we derive a more accurate object mask without requiring specialized training data or model adaptation. We evaluate our method on both synthetic and real-world images and compare it against established and state-of-the-art techniques including Otsu thresholding, YOLO, and SAM2. Compared to the best-performing baseline, SAM2, our approach achieves improvements of up to 26.7% in IoU, 22.3% in DSC, and 9.7% in pixel accuracy. Qualitative evaluations on real-world images further confirm the robustness and generalizability of the proposed approach.
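For readers unfamiliar with the three reported metrics, the following is a minimal sketch of how IoU, DSC (Dice similarity coefficient), and pixel accuracy are commonly computed for binary masks. It assumes the standard textbook definitions; the exact formulation used in the paper is not given in this excerpt, and the function name `segmentation_metrics` and the nested-list mask representation are illustrative choices only.

```python
def segmentation_metrics(pred, truth):
    """Return (IoU, DSC, pixel accuracy) for two equally sized binary masks.

    Masks are nested lists of 0/1 values, where 1 marks an object pixel.
    Standard definitions are assumed, not taken from the paper.
    """
    tp = fp = fn = tn = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p and t:
                tp += 1  # object pixel found by both masks
            elif p:
                fp += 1  # predicted object, actually background
            elif t:
                fn += 1  # missed object pixel
            else:
                tn += 1  # background in both masks

    total = tp + fp + fn + tn
    union = tp + fp + fn
    # Two empty masks agree perfectly, so both overlap scores default to 1.
    iou = tp / union if union else 1.0
    dsc = 2 * tp / (2 * tp + fp + fn) if union else 1.0
    accuracy = (tp + tn) / total if total else 1.0
    return iou, dsc, accuracy


if __name__ == "__main__":
    predicted = [[1, 1, 0],
                 [0, 0, 0]]
    reference = [[1, 0, 0],
                 [1, 0, 0]]
    print(segmentation_metrics(predicted, reference))
```

Because pixel accuracy also counts correctly labeled background, it tends to change less than IoU or DSC when the object occupies a small part of the image.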

Paper Structure (4 sections, 6 equations, 5 figures, 1 table)


Figures (5)

  • Figure 1: Original image and the object masks generated using Otsu, Mask R-CNN, YOLO, and SAM2, as well as our proposed method. The white regions represent the calculated masks.
  • Figure 2: Overview of the proposed pipeline for object mask selection using specular reflections. Starting from the input image $\mathcal{I}$, specular reflection detection identifies candidate regions, which are then processed by SAM2 to generate multiple segmentation masks $\mathcal{M}_1$, $\mathcal{M}_2$, $\mathcal{M}_3$. The mask selector evaluates each candidate using the white-pixel ratio $R_i$ and discards masks above a predefined threshold $R_{\text{max}}$. Among the remaining candidates, the mask with the highest $R_i$ is selected as $\mathcal{M}_\text{selected}$. Finally, the post-processing step refines this mask through connected component analysis and mask inversion to obtain the final segmentation result $\mathcal{M}$.
  • Figure 3: Examples of the specular reflection detection based on SRD1. The upper row shows the original image, while the bottom row displays the corresponding mask $\Omega$, which captures only the core specular highlights and ignores the surrounding intensity falloff.
  • Figure 4: Original image (left), masks $\mathcal{M}_1$ to $\mathcal{M}_3$ generated by SAM2 for a given input specular reflection center $[c_x, c_y]$ (middle), and the final mask $\mathcal{M}$ after post-processing (right). The best candidate among $\mathcal{M}_1$ to $\mathcal{M}_3$ is selected automatically by the mask selector.
  • Figure 5: Qualitative evaluation of the segmentation masks on synthetic and real-world images using Otsu, YOLO, SAM2, and our method RePoSeg. The examples shown here are representative; the complete evaluation was conducted on a substantially larger dataset.
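
The mask selection step described in the caption of Figure 2 can be sketched as follows. This is a hedged illustration and not the authors' implementation: the function names, the nested-list mask representation, and the example threshold value are hypothetical. The caption does not specify how the connected component analysis chooses a region or when the mask is inverted, so the sketch assumes that the 4-connected region containing the reflection center is kept, and it leaves out mask inversion entirely.

```python
from collections import deque


def white_pixel_ratio(mask):
    """Fraction of pixels set to 1 in a binary mask (nested lists of 0/1)."""
    total = sum(len(row) for row in mask)
    white = sum(sum(row) for row in mask)
    return white / total if total else 0.0


def select_mask(candidates, r_max):
    """Return (index, ratio) of the candidate with the highest white-pixel
    ratio that does not exceed r_max, or None if every candidate is discarded."""
    best = None
    for index, mask in enumerate(candidates):
        ratio = white_pixel_ratio(mask)
        if ratio > r_max:
            continue  # discard masks above the threshold
        if best is None or ratio > best[1]:
            best = (index, ratio)
    return best


def component_at(mask, seed):
    """Keep only the 4-connected region of 1-pixels that contains `seed`.

    `seed` is (row, col), i.e. (c_y, c_x) for a reflection center [c_x, c_y].
    Assumption: this is one plausible form of connected component analysis.
    """
    height, width = len(mask), len(mask[0])
    kept = [[0] * width for _ in range(height)]
    row0, col0 = seed
    if not mask[row0][col0]:
        return kept  # seed lies outside the mask, nothing to keep
    kept[row0][col0] = 1
    queue = deque([seed])
    while queue:
        row, col = queue.popleft()
        for nrow, ncol in ((row - 1, col), (row + 1, col),
                           (row, col - 1), (row, col + 1)):
            inside = 0 <= nrow < height and 0 <= ncol < width
            if inside and mask[nrow][ncol] and not kept[nrow][ncol]:
                kept[nrow][ncol] = 1
                queue.append((nrow, ncol))
    return kept


if __name__ == "__main__":
    m1 = [[0, 0, 0, 0], [0, 1, 0, 0], [0, 0, 0, 0]]  # highlight only
    m2 = [[0, 1, 1, 0], [0, 1, 1, 0], [0, 0, 0, 1]]  # object plus a stray pixel
    m3 = [[1, 1, 1, 1], [1, 1, 1, 1], [1, 1, 1, 1]]  # whole image
    index, ratio = select_mask([m1, m2, m3], r_max=0.8)  # 0.8 is a made-up value
    print(index, ratio)
    print(component_at(m2, (1, 1)))
```

Discarding candidates above the threshold before taking the maximum keeps a mask that covers nearly the whole image from winning simply because it is the largest.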