On the Effect of Image Resolution on Semantic Segmentation

Ritambhara Singh; Abhishek Jain; Pietro Perona; Shivani Agarwal; Junfeng Yang

On the Effect of Image Resolution on Semantic Segmentation

Ritambhara Singh, Abhishek Jain, Pietro Perona, Shivani Agarwal, Junfeng Yang

TL;DR

This work challenges the standard practice of downsampling for semantic segmentation by presenting a streamlined, high-resolution architecture that outputs outputs directly at native image resolution. It combines a ResNet-like classifier network with a multi-scale segmentation head that uses Bottom-Up Propagation to fuse coarse and fine features, improving detail preservation without excessive computation. The approach is validated across multiple datasets, with Cityscapes gains further amplified by Noisy Student Training, highlighting the value of semi-supervised learning in high-resolution segmentation. The results suggest that carefully designed multi-scale fusion can achieve state-of-the-art performance without the heavy computational burden of traditional high-resolution models, with implications for real-time, detailed scene understanding in applications like autonomous driving.

Abstract

High-resolution semantic segmentation requires substantial computational resources. Traditional approaches in the field typically downscale the input images before processing and then upscale the low-resolution outputs back to their original dimensions. While this strategy effectively identifies broad regions, it often misses finer details. In this study, we demonstrate that a streamlined model capable of directly producing high-resolution segmentations can match the performance of more complex systems that generate lower-resolution results. By simplifying the network architecture, we enable the processing of images at their native resolution. Our approach leverages a bottom-up information propagation technique across various scales, which we have empirically shown to enhance segmentation accuracy. We have rigorously tested our method using leading-edge semantic segmentation datasets. Specifically, for the Cityscapes dataset, we further boost accuracy by applying the Noisy Student Training technique.

On the Effect of Image Resolution on Semantic Segmentation

TL;DR

Abstract

On the Effect of Image Resolution on Semantic Segmentation

Authors

TL;DR

Abstract

Table of Contents

Figures (2)