SA-MixNet: Structure-aware Mixup and Invariance Learning for Scribble-supervised Road Extraction in Remote Sensing Images
Jie Feng, Hao Huang, Junpeng Zhang, Weisheng Dong, Dingwen Zhang, Licheng Jiao
TL;DR
SA-MixNet targets the robustness gap in scribble-based road extraction with a data-driven, structure-aware approach. It combines Statistic and Content-based Label Expansion, Structure-aware Mixup, and invariance plus connectivity regularizations to enforce consistent, topology-preserving predictions across varied scenes. Empirical results on the DeepGlobe, Wuhan, and Massachusetts road datasets show consistent IoU gains over state-of-the-art weakly supervised and Mixup baselines, and the framework is plug-and-play compatible with different extractors. This work advances practical road extraction under limited annotations by improving generalization, connectivity, and resilience to scene complexity.
Abstract
Mainstream weakly supervised road extractors rely on highly confident pseudo-labels propagated from scribbles, and their performance often degrades as image scenes become more varied. We argue that this degradation stems from the model's poor invariance to scenes of different complexity, whereas existing solutions to this problem commonly rely on crafted priors that cannot be derived from scribbles. To eliminate the reliance on such priors, we propose a novel Structure-aware Mixup and Invariance Learning framework (SA-MixNet) for weakly supervised road extraction that improves model invariance in a data-driven manner. Specifically, we design a structure-aware Mixup scheme that pastes road regions from one image onto another, creating an image scene with increased complexity while preserving the roads' structural integrity. An invariance regularization is then imposed on the predictions for the constructed and original images to minimize their conflicts, which forces the model to behave consistently across varied scenes. Moreover, a discriminator-based regularization is designed to enhance connectivity while preserving the structure of roads. Combining these designs, our framework demonstrates superior performance on the DeepGlobe, Wuhan, and Massachusetts datasets, outperforming state-of-the-art techniques by 1.47%, 2.12%, and 4.09% in IoU, respectively, and shows its potential as a plug-and-play component. The code will be made publicly available.
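To make the paste-and-compare idea concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes a plain mask-based copy-paste in place of the structure-aware Mixup and a mean-squared consistency term in place of the invariance regularization; the abstract does not specify either form, and the function names `paste_roads` and `invariance_loss` are hypothetical.

```python
import numpy as np


def paste_roads(src_img, src_mask, dst_img, dst_mask):
    """Paste road pixels of src_img (where src_mask is set) onto dst_img.

    Assumption: a simple binary-mask paste; the paper's structure-aware
    scheme may select and blend road regions differently.
    """
    m = src_mask.astype(bool)
    # Broadcast the (H, W) mask over the channel axis of (H, W, C) images.
    mixed_img = np.where(m[..., None], src_img, dst_img)
    # The mixed label contains the roads of both images.
    mixed_mask = np.logical_or(m, dst_mask.astype(bool))
    return mixed_img, mixed_mask


def invariance_loss(pred_mixed, pred_src, pred_dst, src_mask):
    """Penalize disagreement between the prediction on the mixed image
    and the per-pixel composite of the predictions on the two originals.

    Assumption: mean squared error; the paper's loss may differ.
    """
    m = src_mask.astype(bool)
    target = np.where(m, pred_src, pred_dst)
    return float(np.mean((pred_mixed - target) ** 2))
```

In this sketch the loss is zero exactly when the model's output on the mixed scene matches what it predicted for each source region in isolation, which is one simple way to read "behaving consistently on various scenes".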
