Graph Information Bottleneck for Remote Sensing Segmentation
Yuntao Shou, Wei Ai, Tao Meng, Nan Yin
TL;DR
This work views remote sensing images as graphs to better model irregular objects, addressing limitations of CNN/Transformer approaches and conventional graph contrastive learning. It introduces SC-ViG, a simple contrastive vision GNN with adaptive node and edge masking, and embeds information bottleneck theory to maximize task-relevant information while minimizing redundancy. The GIB-RSS framework replaces UNet convolutions with graph-based modules and is trained end-to-end with IB-guided losses, achieving state-of-the-art segmentation performance across UAVid, Vaihingen, Potsdam, and LoveDA datasets. The approach demonstrates that flexible graph representations and IB-driven contrastive learning can improve segmentation quality and generalization for remote sensing imagery, with potential for zero-shot extensions using large-scale pre-trained models.
Abstract
Remote sensing segmentation has a wide range of applications, including environmental protection and urban change detection. Although deep learning-based remote sensing segmentation methods (e.g., CNNs and Transformers) have been successful, they are not flexible enough to model irregular objects. In addition, existing graph contrastive learning methods usually maximize mutual information to keep node representations consistent across different graph views, which may cause the model to learn task-independent redundant information. To tackle these problems, this paper treats images as graph structures and introduces a simple contrastive vision GNN (SC-ViG) architecture for remote sensing segmentation. Specifically, we construct node-masked and edge-masked graph views to obtain an optimal graph structure representation, adaptively learning whether to mask nodes and edges. Furthermore, we introduce information bottleneck theory into graph contrastive learning to maximize task-related information while minimizing task-independent redundant information. Finally, we replace the convolutional module in UNet with the SC-ViG module to complete the segmentation and classification tasks of remote sensing images. Extensive experiments on publicly available real datasets demonstrate that our method outperforms state-of-the-art remote sensing image segmentation methods.
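
To make the idea of node-masked and edge-masked graph views concrete, the following is a minimal sketch and not the authors' implementation. It assumes image patches are already embedded as feature vectors, links each patch to its k nearest neighbours, and samples the two views with fixed Bernoulli keep probabilities as a stand-in for the adaptive masking that SC-ViG learns. The names `knn_graph`, `masked_views`, `node_keep_prob`, and `edge_keep_prob` are hypothetical.

```python
import numpy as np


def knn_graph(features, k):
    """Return a boolean adjacency matrix linking each node to its k nearest neighbours."""
    # Pairwise squared Euclidean distances between node features.
    diff = features[:, None, :] - features[None, :, :]
    dist = (diff ** 2).sum(axis=-1)
    np.fill_diagonal(dist, np.inf)  # exclude self-loops
    neighbours = np.argsort(dist, axis=1)[:, :k]
    adj = np.zeros(dist.shape, dtype=bool)
    rows = np.repeat(np.arange(features.shape[0]), k)
    adj[rows, neighbours.ravel()] = True
    return adj


def masked_views(features, adj, node_keep_prob, edge_keep_prob, rng):
    """Sample a node-masked view and an edge-masked view of the graph.

    Assumption: keep probabilities are fixed here; in the paper the
    decision to mask nodes and edges is learned adaptively.
    """
    node_keep = rng.random(features.shape[0]) < node_keep_prob
    node_view = features * node_keep[:, None]  # masked nodes get zero features
    edge_keep = rng.random(adj.shape) < edge_keep_prob
    edge_view = adj & edge_keep  # masking can only remove existing edges
    return node_view, edge_view


rng = np.random.default_rng(0)
patches = rng.normal(size=(16, 8))  # 16 hypothetical patch embeddings of size 8
adj = knn_graph(patches, k=4)
node_view, edge_view = masked_views(patches, adj, 0.8, 0.8, rng)
```

In a contrastive setup, the two views would be encoded separately and their node representations compared; the information bottleneck term described in the abstract would then penalize information shared between views that is not useful for the segmentation labels.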
