Preserving Vertical Structure in 3D-to-2D Projection for Permafrost Thaw Mapping

Justin McMillen, Robert Van Alphen, Taha Sadeghi Chorsi, Jason Shabaga, Mel Rodgers, Rocco Malservisi, Timothy Dixon, Yasin Yilmaz

Abstract

Forecasting permafrost thaw from aerial lidar requires projecting 3D point cloud features onto 2D prediction grids, yet naive aggregation methods destroy the vertical structure that matters in forest environments, where ground, understory, and canopy carry distinct information about subsurface conditions. We propose a projection decoder with learned height embeddings that enable height-dependent feature transformations, allowing the network to differentiate ground-level signals from canopy returns. Combined with stratified sampling that keeps all forest strata represented, the decoder preserves the vertical information needed to predict subsurface conditions. We pair this decoder with a Point Transformer V3 encoder to predict dense thaw depth maps from drone-collected lidar over boreal forest in interior Alaska. Experiments demonstrate that z-stratified projection outperforms standard averaging-based methods, particularly in areas with complex vertical vegetation structure. Our method enables scalable, high-resolution monitoring of permafrost degradation from readily deployable UAV platforms.

Paper Structure (41 sections, 9 equations, 11 figures, 10 tables)


Figures (11)

  • Figure 1: Orthophoto field map of the Farmer's Loop 2 field site. The extent of the orthophoto matches the extent of the lidar data. Inset: location of Fairbanks, AK, where the field site is located.
  • Figure 2: md-Lidar1000HR UAV with lidar and RGB camera.
  • Figure 3: Model architecture overview. A point cloud with per-point features (XYZ, RGB, intensity) is processed by a Point Transformer V3 encoder, which produces multi-scale point features at four hierarchical stages. Each stage is independently projected to a 2D feature map via our height-aware projection mechanism with learned z-embeddings (see Fig. 4), preserving vertical forest structure during the 3D-to-2D transformation. The resulting feature maps are concatenated and fused through $1\times1$ convolutions. A lightweight convolutional head produces the final per-pixel thaw depth prediction, supporting both regression and classification formulations.
  • Figure 4: Height-aware projection mechanism. (a) Input point cloud with vertical forest structure (ground, understory, canopy) above a query grid cell $q_j$. (b) From XY-nearest candidates ($M = 2k$), farthest point sampling in the z-dimension selects $k$ points that span the full vertical extent. (c) Selected points are sorted by height and augmented with learned z-embeddings that enable height-dependent feature transformations. (d) The concatenated profile vector ($k \times D$) is projected through an MLP to produce a single aggregated feature. (e) This process repeats for all query locations, yielding an $H \times W \times D$ feature map.
  • Figure 5: Qualitative comparison. Each row: input point cloud, regression ground truth, regression prediction, classification ground truth, classification prediction.
  • ...and 6 more figures
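
The per-cell steps described in the Figure 4 caption (XY-nearest candidates with $M = 2k$, farthest point sampling in z, height sorting, z-embeddings, projection to one feature) can be illustrated with a small NumPy sketch. This is not the authors' implementation. The function names are hypothetical, and several details are assumptions made only for illustration: sampling starts from the lowest return, the z-embedding is a table indexed by height rank and added to the point features, a single linear layer with ReLU stands in for the MLP, and random weights stand in for learned parameters.

```python
import numpy as np

def z_farthest_point_sampling(z, k):
    """Greedy farthest point sampling on 1D heights; returns k positions in z."""
    z = np.asarray(z, dtype=float)
    if k > len(z):
        raise ValueError("k must not exceed the number of candidates")
    selected = [int(np.argmin(z))]            # assumption: seed with the lowest return
    dist = np.abs(z - z[selected[0]])         # distance of each point to the selected set
    while len(selected) < k:
        nxt = int(np.argmax(dist))            # point farthest (in z) from the selected set
        selected.append(nxt)
        dist = np.minimum(dist, np.abs(z - z[nxt]))
    return np.array(selected)

def select_height_profile(points, query_xy, k):
    """Pick k points near query_xy that span the vertical extent, sorted by height."""
    m = min(2 * k, len(points))               # M = 2k XY-nearest candidates
    d_xy = np.linalg.norm(points[:, :2] - query_xy, axis=1)
    cand = np.argsort(d_xy)[:m]
    picked = cand[z_farthest_point_sampling(points[cand, 2], k)]
    return picked[np.argsort(points[picked, 2])]

def project_cell(feats, z_embed, W, b):
    """Add rank-indexed z-embeddings, flatten the (k, D) profile, map it to (D,)."""
    profile = (feats + z_embed).reshape(-1)   # assumption: embeddings are added
    return np.maximum(W @ profile + b, 0.0)   # linear layer + ReLU as an MLP stand-in

# Toy cloud: dense ground returns, one understory point, one canopy point,
# and one point that is far away in XY and should never be a candidate.
points = np.array([
    [0.00, 0.00, 0.10],
    [0.10, 0.00, 0.00],
    [0.00, 0.10, 0.20],
    [0.10, 0.10, 0.15],
    [0.05, 0.05, 3.00],
    [0.10, 0.05, 10.0],
    [5.00, 5.00, 20.0],
])
k, D = 3, 4
idx = select_height_profile(points, np.array([0.0, 0.0]), k)

rng = np.random.default_rng(0)
point_feats = rng.normal(size=(len(points), D))  # stand-in for encoder features
z_embed = rng.normal(size=(k, D))                # stand-in for learned z-embeddings
W, b = rng.normal(size=(D, k * D)), np.zeros(D)
cell_feature = project_cell(point_feats[idx], z_embed, W, b)
```

In this toy cell, four of the six candidates are ground returns, so a plain mean over the candidates would be dominated by the ground layer. The z-stratified selection instead keeps one ground, one understory, and one canopy point, which is the behaviour the caption describes.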