Benchmarking Segmentation Models with Mask-Preserved Attribute Editing
Zijin Yin, Kongming Liang, Bing Li, Zhanyu Ma, Jun Guo
TL;DR
This work evaluates segmentation robustness to both local and global attribute variations by introducing a mask-preserved attribute editing pipeline, which edits attributes in real images while keeping their segmentation masks intact. The authors build Pascal-EA, a robustness benchmark derived from Pascal VOC that covers object-level attributes (color, material, pattern) and image-level styles (snow, painting, sketch). A diffusion-based editing framework with mask-guided attention and ControlNet produces edited images that preserve the original layout, enabling reliable evaluation of diverse segmentation models, including open-vocabulary approaches. The key findings are that local attribute changes are as influential as global style shifts, with material alterations having the strongest impact, and that stronger models are not automatically more robust, which underscores the need for attribute-aware evaluation and robustness improvements.
Abstract
When deploying segmentation models in practice, it is critical to evaluate their behavior in varied and complex scenes. Unlike previous evaluation paradigms, which consider only global attribute variations (e.g., adverse weather), we investigate both local and global attribute variations for robustness evaluation. To achieve this, we construct a mask-preserved attribute editing pipeline that edits the visual attributes of real images with precise control of structural information. Because the structure is preserved, the original segmentation labels can be reused for the edited images. Using our pipeline, we construct a benchmark covering both object and image attributes (e.g., color, material, pattern, style). We evaluate a broad variety of semantic segmentation models, ranging from conventional closed-set models to recent open-vocabulary large models, on their robustness to different types of variations. We find that both local and global attribute variations affect segmentation performance, and that the sensitivity of models differs across variation types. We argue that local attributes are as important as global attributes and should be considered in the robustness evaluation of segmentation models. Code: https://github.com/PRIS-CV/Pascal-EA.
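
The evaluation idea in the abstract, reusing the original labels to score predictions on edited images, can be illustrated with a minimal sketch. It assumes mean IoU as the metric and the common Pascal VOC convention of 255 as the ignore label; the function names (`confusion_matrix`, `mean_iou`, `robustness_drop`) are hypothetical and are not taken from the Pascal-EA code.

```python
import numpy as np

def confusion_matrix(pred, label, num_classes, ignore_index=255):
    """Confusion matrix for one image (rows: ground truth, columns: prediction)."""
    valid = label != ignore_index  # skip pixels marked as "ignore"
    idx = label[valid].astype(np.int64) * num_classes + pred[valid].astype(np.int64)
    counts = np.bincount(idx, minlength=num_classes ** 2)
    return counts.reshape(num_classes, num_classes)

def mean_iou(preds, labels, num_classes, ignore_index=255):
    """Mean IoU over a set of images, averaged over classes that occur."""
    cm = np.zeros((num_classes, num_classes), dtype=np.int64)
    for pred, label in zip(preds, labels):
        cm += confusion_matrix(pred, label, num_classes, ignore_index)
    inter = np.diag(cm)
    union = cm.sum(axis=0) + cm.sum(axis=1) - inter
    present = union > 0  # ignore classes absent from both prediction and label
    return float((inter[present] / union[present]).mean())

def robustness_drop(preds_orig, preds_edit, labels, num_classes):
    """Score both prediction sets against the SAME labels.

    This is only valid because the edit is assumed to preserve the mask.
    Returns (mIoU on originals, mIoU on edited images, absolute drop).
    """
    miou_orig = mean_iou(preds_orig, labels, num_classes)
    miou_edit = mean_iou(preds_edit, labels, num_classes)
    return miou_orig, miou_edit, miou_orig - miou_edit

if __name__ == "__main__":
    # Toy 2x2 example with two classes; real use would pass model predictions.
    labels = [np.array([[0, 0], [1, 1]])]
    preds_orig = [np.array([[0, 0], [1, 1]])]  # perfect on the original image
    preds_edit = [np.array([[0, 1], [1, 1]])]  # one pixel flips after editing
    print(robustness_drop(preds_orig, preds_edit, labels, num_classes=2))
```

Computing the drop per attribute type (for example color, material, pattern, or style) would make it possible to compare a model's sensitivity across variation types, which is the kind of comparison the abstract describes.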
