Renovating Names in Open-Vocabulary Segmentation Benchmarks

Haiwen Huang; Songyou Peng; Dan Zhang; Andreas Geiger

TL;DR

RENOVATE tackles the naming misalignment in open-vocabulary segmentation by automatically generating context-rich candidate names and learning a per-segment renaming model that aligns visual segments with refined text labels. The approach uses context-noun augmentation and GPT-4-based candidate generation, a CLIP-enabled transformer decoder with feedforward attention biased by ground-truth masks, and negative sampling to produce high-quality, segment-level names. Empirical results show RENOVATE improves open-vocabulary generalization (up to ~4 PQ and ~5 mIoU gains) and data efficiency, while enabling fine-grained evaluation via semantic-name similarity metrics that reveal benign misclassifications and model biases. The work demonstrates practical benefits for relabeling datasets like COCO, ADE20K, and Cityscapes and provides resources for improved benchmarking and dataset curation in vision-language segmentation.
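The name-selection step described above can be illustrated with a minimal sketch. It assumes each segment and each candidate name has already been embedded into a shared vector space, and it picks the candidate closest to the segment. The source describes a trained, CLIP-based transformer decoder for this matching; the plain cosine similarity, the function names `cosine` and `pick_name`, and the toy vectors below are all hypothetical stand-ins for illustration only.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def pick_name(segment_embedding, candidate_embeddings):
    """Return (name, score) for the candidate most similar to the segment.

    candidate_embeddings maps each candidate name to its embedding vector.
    """
    scores = {
        name: cosine(segment_embedding, emb)
        for name, emb in candidate_embeddings.items()
    }
    best = max(scores, key=scores.get)
    return best, scores[best]


# Toy embeddings (made up): the original class name plus two more specific candidates.
candidates = {
    "person": [1.0, 0.0, 0.0],
    "skier": [0.6, 0.8, 0.0],
    "pedestrian": [0.6, 0.0, 0.8],
}
segment = [0.5, 0.9, 0.1]

best_name, best_score = pick_name(segment, candidates)
print(best_name, round(best_score, 3))
```

In this toy setup the segment vector lies closest to the "skier" candidate, so the generic name "person" would be replaced by the more specific one. A learned matching model would replace `cosine` here.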

Abstract

Names are essential to both human cognition and vision-language models. Open-vocabulary models utilize class names as text prompts to generalize to categories unseen during training. However, the precision of these names is often overlooked in existing datasets. In this paper, we address this underexplored problem by presenting a framework for "renovating" names in open-vocabulary segmentation benchmarks (RENOVATE). Our framework features a renaming model that enhances the quality of names for each visual segment. Through experiments, we demonstrate that our renovated names help train stronger open-vocabulary models with up to 15% relative improvement and significantly enhance training efficiency with improved data quality. We also show that our renovated names improve evaluation by better measuring misclassification and enabling fine-grained model analysis. We will provide our code and relabelings for several popular segmentation datasets (MS COCO, ADE20K, Cityscapes) to the research community.

Paper Structure (29 sections, 3 equations, 19 figures, 7 tables)

Introduction
Related Work
RENOVATE: Renaming Segmentation Benchmarks
Generating candidate names
Training for candidate name selection
Obtaining renovated names
Applications of RENOVATE Names
Training with RENOVATE names
Improving evaluation with RENOVATE names
Experiments
Obtaining renovated names
Training with renovated names
Improving evaluation with renovated names
Conclusion and Limitations
More Literature Review
...and 14 more sections

Figures (19)

Figure 1: Problems of names in current segmentation benchmarks. We demonstrate examples from well-known datasets: MS COCO (Lin et al., 2014), ADE20K (Zhou et al., 2017), and Cityscapes (Cordts et al., 2016). Our renovated names are visually more aligned and help models to generalize better.
Figure 2: Overview of candidate name generation and renaming model training. We generate candidate names based on the context names and train the renaming model to match them with the segments. For clarity of illustration, we show only one segment. In practice, multiple segments are trained jointly, paired with the text queries.
Figure 3: Obtaining renovated names. In (a) we illustrate how we use the renaming model to obtain a renovated name for each segment. In (b) we demonstrate that the renaming results are helpful for dataset analysis, with examples from the "person" class.
Figure 4: Examples of renovated names on segments from the validation sets of ADE20K and Cityscapes. For each segment, we show the original name below the image and the renovated name in the text box. See more visual results in the supplements.
Figure 5: MS COCO $\rightarrow$ ADE20K.
...and 14 more figures
