
Transferring Physical Priors into Remote Sensing Segmentation via Large Language Models

Yuxi Lu, Kunqi Li, Zhidong Li, Xiaohan Su, Biao Wu, Chenya Huang, Bin Liang

Abstract

Semantic segmentation of remote sensing imagery is fundamental to Earth observation. Achieving accurate results requires integrating not only optical images but also physical variables such as the Digital Elevation Model (DEM), Synthetic Aperture Radar (SAR), and Normalized Difference Vegetation Index (NDVI). Recent foundation models (FMs) leverage pre-training to exploit these variables but still depend on spatially aligned data and costly retraining whenever new sensors are involved. To overcome these limitations, we introduce a novel paradigm for integrating domain-specific physical priors into segmentation models. We first construct a Physics-Centric Knowledge Graph (PCKG) by prompting large language models to extract physical priors from a vocabulary of 1,763 terms, and use it to build a heterogeneous, spatially aligned dataset, Phy-Sky-SA. Building on this foundation, we develop PriorSeg, a physics-aware residual refinement model trained with a joint visual-physical strategy that incorporates a novel physics-consistency loss. Experiments in heterogeneous settings demonstrate that PriorSeg improves segmentation accuracy and physical plausibility without retraining the FMs. Ablation studies verify the effectiveness of the Phy-Sky-SA dataset, the PCKG, and the physics-consistency loss.

Paper Structure

This paper contains 19 sections, 8 equations, 6 figures, and 2 tables.

Figures (6)

  • Figure 1: Visual and visual-physical segmentation: adding the physical prior (simulated SAR) preserves the true object while preventing the shadow from being segmented.
  • Figure 2: Schematic of PCKG construction with GPT-4o. Category labels are prompted to the LLM, which returns NDVI, DEM, and SAR intervals $[a,b]$ plus a brief reasoning trace. The results are stored as JSON records and aggregated into a lightweight Physics-Centric Knowledge Graph that later guides synthetic-modal generation and loss design.
  • Figure 3: Example sample from the Phy-Sky-SA dataset. From left to right: Image, GT, and the three simulated physical variables, NDVI(S), DEM(S), and SAR(S), generated according to the PCKG, where "(S)" denotes simulated values.
  • Figure 4: Overview of the PriorSeg training framework. A frozen backbone extracts visual features and an initial mask, which are fused with optional physical variables. A residual head refines the segmentation, and three losses jointly enforce visual quality and physical consistency.
  • Figure 5: Three representative visualization examples on SegEarth-OV backbone, comparing GT, SegEarth-OV predictions, and our PriorSeg-enhanced results.
  • ...and 1 more figure
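Figure 2 describes the PCKG pipeline concretely: category labels are prompted to an LLM, which returns per-variable intervals $[a,b]$ that are stored as JSON records and aggregated into a lightweight graph, which later drives simulated-modal generation (the "(S)" channels in Figure 3). The following is a minimal sketch of that record format and aggregation step; the interval values, function names, and uniform-sampling generator are illustrative assumptions, not the paper's actual priors or implementation.

```python
import json
import random

# Hypothetical stand-in for LLM responses: each category maps to
# suggested intervals [a, b] for NDVI (unitless), DEM (m), and SAR (dB).
# These numbers are placeholders, not the paper's extracted priors.
MOCK_LLM_RESPONSES = {
    "forest":   {"NDVI": [0.6, 0.9],  "DEM": [0, 2500], "SAR": [-12.0, -6.0]},
    "water":    {"NDVI": [-0.3, 0.1], "DEM": [0, 500],  "SAR": [-25.0, -15.0]},
    "building": {"NDVI": [-0.1, 0.2], "DEM": [0, 3000], "SAR": [-5.0, 5.0]},
}

def build_pckg(responses):
    """Aggregate per-category JSON records into a lightweight graph:
    nodes are categories and physical variables; edges carry intervals."""
    graph = {"nodes": set(), "edges": []}
    for category, priors in responses.items():
        graph["nodes"].add(category)
        for variable, (lo, hi) in priors.items():
            graph["nodes"].add(variable)
            graph["edges"].append(
                {"from": category, "to": variable, "interval": [lo, hi]}
            )
    return graph

def simulate_variable(pckg, category, variable, rng=random):
    """Draw one simulated '(S)' value for a category's physical variable,
    here simply uniform within the PCKG interval."""
    for edge in pckg["edges"]:
        if edge["from"] == category and edge["to"] == variable:
            lo, hi = edge["interval"]
            return rng.uniform(lo, hi)
    raise KeyError(f"no prior for ({category}, {variable})")

pckg = build_pckg(MOCK_LLM_RESPONSES)
print(json.dumps(pckg["edges"][0]))       # one serialized graph edge
print(simulate_variable(pckg, "forest", "NDVI"))
```

In practice the per-pixel generation would rasterize such draws over the ground-truth mask (one interval per labeled region), which is one plausible reading of how the simulated NDVI(S)/DEM(S)/SAR(S) channels in Figure 3 are produced.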