On the Effect of Image Resolution on Semantic Segmentation
Ritambhara Singh, Abhishek Jain, Pietro Perona, Shivani Agarwal, Junfeng Yang
TL;DR
This work challenges the standard practice of downsampling for semantic segmentation by presenting a streamlined, high-resolution architecture that outputs outputs directly at native image resolution. It combines a ResNet-like classifier network with a multi-scale segmentation head that uses Bottom-Up Propagation to fuse coarse and fine features, improving detail preservation without excessive computation. The approach is validated across multiple datasets, with Cityscapes gains further amplified by Noisy Student Training, highlighting the value of semi-supervised learning in high-resolution segmentation. The results suggest that carefully designed multi-scale fusion can achieve state-of-the-art performance without the heavy computational burden of traditional high-resolution models, with implications for real-time, detailed scene understanding in applications like autonomous driving.
Abstract
High-resolution semantic segmentation requires substantial computational resources. Traditional approaches in the field typically downscale the input images before processing and then upscale the low-resolution outputs back to their original dimensions. While this strategy effectively identifies broad regions, it often misses finer details. In this study, we demonstrate that a streamlined model capable of directly producing high-resolution segmentations can match the performance of more complex systems that generate lower-resolution results. By simplifying the network architecture, we enable the processing of images at their native resolution. Our approach leverages a bottom-up information propagation technique across various scales, which we have empirically shown to enhance segmentation accuracy. We have rigorously tested our method using leading-edge semantic segmentation datasets. Specifically, for the Cityscapes dataset, we further boost accuracy by applying the Noisy Student Training technique.
