Graph Information Bottleneck for Remote Sensing Segmentation

Yuntao Shou, Wei Ai, Tao Meng, Nan Yin

TL;DR

This work views remote sensing images as graphs to better model irregular objects, addressing limitations of CNN/Transformer approaches and conventional graph contrastive learning. It introduces SC-ViG, a simple contrastive vision GNN with adaptive node and edge masking, and embeds information bottleneck theory to maximize task-relevant information while minimizing redundancy. The GIB-RSS framework replaces UNet convolutions with graph-based modules and is trained end-to-end with IB-guided losses, achieving state-of-the-art segmentation performance across UAVid, Vaihingen, Potsdam, and LoveDA datasets. The approach demonstrates that flexible graph representations and IB-driven contrastive learning can improve segmentation quality and generalization for remote sensing imagery, with potential for zero-shot extensions using large-scale pre-trained models.

Abstract

Remote sensing segmentation has a wide range of applications, including environmental protection and urban change detection. Despite the success of deep learning-based remote sensing segmentation methods (e.g., CNNs and Transformers), they are not flexible enough to model irregular objects. In addition, existing graph contrastive learning methods usually maximize mutual information to keep node representations consistent across different graph views, which may cause the model to learn task-independent redundant information. To tackle these problems, this paper treats images as graph structures and introduces a simple contrastive vision GNN (SC-ViG) architecture for remote sensing segmentation. Specifically, we construct node-masked and edge-masked graph views to obtain an optimal graph structure representation, adaptively learning whether to mask nodes and edges. Furthermore, this paper introduces information bottleneck theory into graph contrastive learning to maximize task-related information while minimizing task-independent redundant information. Finally, we replace the convolutional module in UNet with the SC-ViG module to complete the segmentation and classification tasks of remote sensing images. Extensive experiments on publicly available real datasets demonstrate that our method outperforms state-of-the-art remote sensing image segmentation methods.
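
To make the "image as graph" idea concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes non-overlapping square patches as nodes, a k-nearest-neighbour rule in raw feature space for edges, and random Bernoulli masks as a stand-in for the adaptive, learned node and edge masking described in the abstract. The function names (`image_to_patches`, `knn_edges`, `masked_views`) and all parameter values are hypothetical.

```python
import numpy as np


def image_to_patches(image, patch_size):
    """Split an (H, W, C) image into non-overlapping patches, one row per patch."""
    h, w, c = image.shape
    assert h % patch_size == 0 and w % patch_size == 0
    gh, gw = h // patch_size, w // patch_size
    patches = image.reshape(gh, patch_size, gw, patch_size, c)
    patches = patches.transpose(0, 2, 1, 3, 4)  # (gh, gw, p, p, c)
    return patches.reshape(gh * gw, patch_size * patch_size * c)


def knn_edges(features, k):
    """Return a (2, N*k) array of directed edges: each node -> its k nearest nodes."""
    sq = np.sum(features ** 2, axis=1)
    dist = sq[:, None] + sq[None, :] - 2.0 * features @ features.T
    np.fill_diagonal(dist, np.inf)  # exclude self-loops
    neighbours = np.argsort(dist, axis=1)[:, :k]
    sources = np.repeat(np.arange(features.shape[0]), k)
    return np.stack([sources, neighbours.reshape(-1)])


def masked_views(features, edges, node_keep, edge_keep, rng):
    """Build a node-masked and an edge-masked view.

    Assumption: masks are sampled at random here; the paper learns them adaptively.
    """
    node_mask = rng.random(features.shape[0]) < node_keep
    edge_mask = rng.random(edges.shape[1]) < edge_keep
    node_view = features * node_mask[:, None]  # zero out features of masked nodes
    edge_view = edges[:, edge_mask]            # drop masked edges
    return node_view, edge_view


rng = np.random.default_rng(0)
image = rng.random((32, 32, 3))          # toy stand-in for a remote sensing tile
x = image_to_patches(image, patch_size=8)
e = knn_edges(x, k=4)
x_masked, e_masked = masked_views(x, e, node_keep=0.8, edge_keep=0.8, rng=rng)
```

In a contrastive setup, the two views would be encoded by a GNN and compared; the information bottleneck term would then penalize information in the representations that does not help the segmentation labels. Those parts are omitted here because the source text does not specify them.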

Paper Structure (25 sections, 19 equations, 6 figures, 8 tables)


Figures (6)

  • Figure 1: Illustrative examples of different modeling approaches for an image. (a) CNNs view images as regular grid structures (i.e., squares and rectangles). (b) Transformers treat images as a continuous sequence structure. (c) We believe that both sequence structure and grid structure are special cases of graph structure, and that graph structure can flexibly model regular and irregular objects. We thus view images as graph structures.
  • Figure 2: The architecture of the proposed GraphUNet method. Specifically, we first divide the image into patches and construct a graph from them. Then we replace the convolutional block in UNet with our GCN Block and use the constructed graph as the input. Finally, we build an MLP to classify pixels.
  • Figure 3: Overview of the GCN Embedded Block framework. We synthesize node-mask and edge-mask views to obtain better node representations. Specifically, we introduce information bottleneck theory in multiple graph comparison views to maximize feature information related to node classification tasks while minimizing redundant information of nodes.
  • Figure 4: Visualization of the segmentation results of different models on the Potsdam dataset.
  • Figure 5: Visualization of the segmentation results of different models on the Vaihingen dataset.
  • ...and 1 more figures