Soft labelling for semantic segmentation: Bringing coherence to label down-sampling
Roberto Alcover-Couso, Marcos Escudero-Vinolo, Juan C. SanMiguel, Jose M. Martinez
TL;DR
The paper tackles the problem of down-sampling in semantic segmentation causing misalignment between colour and label information, especially at high down-sampling factors. It introduces soft-labels for label down-sampling and pairs them with the colour sampling to preserve information and align distributions, formalized through one-hot label encodings and region-based soft-label computation. The authors present extensive experiments across Cityscapes, Mapillary, and ADE20K showing that paired soft-label down-sampling yields higher mean IoU than standard baselines while using far fewer resources, often matching or exceeding state-of-the-art results on constrained hardware. This approach enables competitive semantic segmentation in budget-constrained settings and opens pathways for further improvements via soft-colour encoding and broader dataset evaluation.
Abstract
In semantic segmentation, training data down-sampling is commonly performed due to limited resources, the need to adapt image size to the model input, or the desire to improve data augmentation. This down-sampling typically employs different strategies for the image data and the annotated labels. Such discrepancy leads to mismatches between the down-sampled color and label images. As a result, training performance decreases significantly as the down-sampling factor increases. In this paper, we bring together the down-sampling strategies for the image data and the training labels. To that end, we propose a novel framework for label down-sampling via soft-labeling that better conserves label information after down-sampling, thereby fully aligning soft-labels with the image data and keeping the distribution of the sampled pixels. This proposal also produces reliable annotations for under-represented semantic classes. Altogether, it allows training competitive models at lower resolutions. Experiments show that the proposal outperforms other down-sampling strategies. Moreover, state-of-the-art performance is achieved on reference benchmarks while employing significantly fewer computational resources than leading approaches. This proposal enables competitive research for semantic segmentation under resource constraints.
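
The idea of region-based soft labels can be illustrated with a minimal sketch. It assumes that each down-sampled pixel corresponds to a non-overlapping square block of the original label map and that the soft label is the average of the one-hot encodings inside that block. The exact region definition used in the paper is not given in this summary, so this is an illustration rather than the authors' implementation. The function names `soft_label_downsample` and `nearest_downsample` are hypothetical.

```python
import numpy as np


def soft_label_downsample(labels, num_classes, factor):
    """Average one-hot labels over non-overlapping factor x factor blocks.

    Assumption: square, non-overlapping blocks (as in area interpolation).
    Returns an array of shape (H / factor, W / factor, num_classes) whose
    last axis holds the fraction of block pixels belonging to each class.
    """
    h, w = labels.shape
    if h % factor or w % factor:
        raise ValueError("height and width must be divisible by factor")
    # One-hot encode: (H, W) integer labels -> (H, W, C) floats.
    one_hot = np.eye(num_classes, dtype=np.float32)[labels]
    # Split both spatial axes into (block index, offset inside block).
    blocks = one_hot.reshape(h // factor, factor, w // factor, factor, num_classes)
    # Average over the two "offset inside block" axes.
    return blocks.mean(axis=(1, 3))


def nearest_downsample(labels, factor):
    """Baseline for comparison: keep the top-left label of each block."""
    return labels[::factor, ::factor]


labels = np.array([
    [0, 0, 1, 1],
    [0, 2, 1, 1],
    [0, 0, 0, 0],
    [0, 0, 0, 1],
])

soft = soft_label_downsample(labels, num_classes=3, factor=2)
hard = nearest_downsample(labels, factor=2)

print(soft[0, 0])  # class proportions of the top-left 2x2 block
print(hard)        # hard labels after nearest-neighbour sampling
```

In this toy label map, class 2 occupies a single pixel. Nearest-neighbour sampling drops it entirely, whereas the soft label of the top-left block still records it with a proportion of 0.25. Applying the same block averaging to the color image would keep both signals sampled over identical regions, which is the kind of pairing the abstract describes.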
