Table of Contents
Fetching ...

CATSNet: a context-aware network for Height Estimation in a Forested Area based on Pol-TomoSAR data

Wenyu Yang, Sergio Vitale, Hossein Aghababaei, Giampaolo Ferraioli, Vito Pascazio, Gilda Schirinzi

TL;DR

The paper tackles height estimation of forest structure from Pol-TomoSAR data, addressing limitations of pixel-wise approaches. It introduces CATSNet, a context-aware patch-based CNN (U-Net) that ingests covariance-matrix patches and uses LiDAR heights as ground truth to segment forest and ground heights. Results on Paracou and Lopè demonstrate that CATSNet outperforms traditional TomoSAR methods and the pixel-wise TSNN, with strong generalization across full, dual, and single polarization data and effective adaptation to new areas via fine-tuning. A unified CATSNet variant offers a computation-efficient alternative with competitive accuracy, highlighting practical applicability for upcoming missions like BIOMASS.

Abstract

Tropical forests are a key component of the global carbon cycle. With plans for upcoming space-borne missions like BIOMASS to monitor forestry, several airborne missions, including TropiSAR and AfriSAR campaigns, have been successfully launched and experimented. Typical Synthetic Aperture Radar Tomography (TomoSAR) methods involve complex models with low accuracy and high computation costs. In recent years, deep learning methods have also gained attention in the TomoSAR framework, showing interesting performance. Recently, a solution based on a fully connected Tomographic Neural Network (TSNN) has demonstrated its effectiveness in accurately estimating forest and ground heights by exploiting the pixel-wise elements of the covariance matrix derived from TomoSAR data. This work instead goes beyond the pixel-wise approach to define a context-aware deep learning-based solution named CATSNet. A convolutional neural network is considered to leverage patch-based information and extract features from a neighborhood rather than focus on a single pixel. The training is conducted by considering TomoSAR data as the input and Light Detection and Ranging (LiDAR) values as the ground truth. The experimental results show striking advantages in both performance and generalization ability by leveraging context information within Multiple Baselines (MB) TomoSAR data across different polarimetric modalities, surpassing existing techniques.

CATSNet: a context-aware network for Height Estimation in a Forested Area based on Pol-TomoSAR data

TL;DR

The paper tackles height estimation of forest structure from Pol-TomoSAR data, addressing limitations of pixel-wise approaches. It introduces CATSNet, a context-aware patch-based CNN (U-Net) that ingests covariance-matrix patches and uses LiDAR heights as ground truth to segment forest and ground heights. Results on Paracou and Lopè demonstrate that CATSNet outperforms traditional TomoSAR methods and the pixel-wise TSNN, with strong generalization across full, dual, and single polarization data and effective adaptation to new areas via fine-tuning. A unified CATSNet variant offers a computation-efficient alternative with competitive accuracy, highlighting practical applicability for upcoming missions like BIOMASS.

Abstract

Tropical forests are a key component of the global carbon cycle. With plans for upcoming space-borne missions like BIOMASS to monitor forestry, several airborne missions, including TropiSAR and AfriSAR campaigns, have been successfully launched and experimented. Typical Synthetic Aperture Radar Tomography (TomoSAR) methods involve complex models with low accuracy and high computation costs. In recent years, deep learning methods have also gained attention in the TomoSAR framework, showing interesting performance. Recently, a solution based on a fully connected Tomographic Neural Network (TSNN) has demonstrated its effectiveness in accurately estimating forest and ground heights by exploiting the pixel-wise elements of the covariance matrix derived from TomoSAR data. This work instead goes beyond the pixel-wise approach to define a context-aware deep learning-based solution named CATSNet. A convolutional neural network is considered to leverage patch-based information and extract features from a neighborhood rather than focus on a single pixel. The training is conducted by considering TomoSAR data as the input and Light Detection and Ranging (LiDAR) values as the ground truth. The experimental results show striking advantages in both performance and generalization ability by leveraging context information within Multiple Baselines (MB) TomoSAR data across different polarimetric modalities, surpassing existing techniques.
Paper Structure (14 sections, 3 equations, 13 figures, 5 tables)

This paper contains 14 sections, 3 equations, 13 figures, 5 tables.

Figures (13)

  • Figure 1: Workflow of the proposed method. For each range-azimuth pixel of the MPMB Tomostack, the $\Phi \times N$ real diagonal elements of the matrix $R$ and the real parts and the imaginary parts of elements in the first are stacked and considered as the channels of the input patch. The annotations consist of the corresponding quantized LiDAR-based heights: LiDAR-based CHM in the case of training targeted for prediction of forest height; and LiDAR-based DTM in the case of training targeted for prediction of ground height.
  • Figure 2: Architecture of CATSNet. The architecture is composed of a contracting encoder path (left side) and an expansive decoder path (right side). The encoder path consists of five downsampling levels and the decoder consists of five upsampling levels.
  • Figure 3: The geographic location of Paracou site (left) and the image coverage (right).
  • Figure 4: The geographic location of Lopè site (left) and the image coverage (right).
  • Figure 5: Pauli RGB images of testing patch extracted by Paracou (left) and Lopè areas (right).
  • ...and 8 more figures