ViGEO: an Assessment of Vision GNNs in Earth Observation

Luca Colomba, Paolo Garza

TL;DR

The paper investigates Vision GNNs for Earth Observation land-cover classification by adapting ViG to multispectral, low-resolution remote-sensing data and evaluating it on three benchmarks (RESISC45, PatternNet, BigEarthNet). ViG achieves strong, often state-of-the-art results in both multiclass and multilabel tasks, outperforming ViT and ResNet while using a compact parameter budget. The work demonstrates the efficacy of graph-based image representations in EO and provides reproducible code to facilitate further exploration in multispectral remote sensing, GNNs, and land-monitoring applications. The authors also outline future directions toward multimodal data fusion and object detection in the EO domain, highlighting practical impact for environmental monitoring and disaster response.

Abstract

Satellite missions and Earth Observation (EO) systems represent fundamental assets for environmental monitoring, the timely identification of catastrophic events, and the long-term monitoring of both natural resources and human-made assets, such as vegetation, water bodies, forests, and buildings. Different EO missions, such as MODIS, Sentinel-1 and Sentinel-2, enable the collection of information across several spectral bands. Given the recent advances in machine learning and computer vision and the availability of labeled data, researchers have demonstrated the feasibility and precision of land-use monitoring systems and remote sensing image classification based on deep neural networks. Such systems may help domain experts and governments with continuous environmental monitoring, enabling timely intervention in case of catastrophic events (e.g., a forest wildfire in a remote area). Despite the recent advances in the field of computer vision, many works limit their analysis to Convolutional Neural Networks (CNNs) and, more recently, vision transformers (ViTs). Given the recent successes of Graph Neural Networks (GNNs) on non-graph data, such as time series and images, we investigate the performance of a recent Vision GNN architecture (ViG) applied to the task of land cover classification. The experimental results show that ViG achieves state-of-the-art performance in multiclass and multilabel classification contexts, surpassing both ViT and ResNet on large-scale benchmarks.

Paper Structure

This paper contains 15 sections, 2 equations, 2 figures, and 5 tables.

Figures (2)

  • Figure 1: Two samples extracted from RESISC45, each split into 16 patches, belonging to class "Lake" and class "Snowberg", respectively. Red arrows represent possible dynamically created edges with $K = 4$ in the Grapher layer, chosen according to the patch embeddings.
  • Figure 2: ViG's modified architecture. The ViG Encoder block shows a single step of the encoder module, which is repeated three times. The downsample module is absent in the third encoder block. "Clf head" stands for "Classification head".
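
The dynamically created edges described in the Figure 1 caption can be illustrated with a small sketch. It is not the authors' implementation: the function name `knn_edges`, the use of Euclidean distance, the embedding dimension, and the random embeddings are all assumptions made for illustration. Only the 16 patches and $K = 4$ come from the caption.

```python
import math
import random


def knn_edges(embeddings, k=4):
    """For each patch, return the indices of its k nearest patches.

    Distance is Euclidean between patch embeddings (an assumption for this
    sketch); a patch is never listed as its own neighbour.
    """
    neighbours = []
    for i, e_i in enumerate(embeddings):
        # Distance from patch i to every other patch j.
        dists = [
            (math.dist(e_i, e_j), j)
            for j, e_j in enumerate(embeddings)
            if j != i
        ]
        dists.sort()
        neighbours.append([j for _, j in dists[:k]])
    return neighbours


random.seed(0)
num_patches, dim = 16, 8  # 16 patches as in Figure 1; dim is arbitrary here
patch_embeddings = [
    [random.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_patches)
]

# edges[i] holds the K = 4 patches that patch i connects to.
edges = knn_edges(patch_embeddings, k=4)
```

Because the neighbours are computed from the embeddings rather than from patch positions, the resulting graph can connect patches that are far apart in the image, and it changes whenever the embeddings change.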