VibrantVS: A high-resolution multi-task transformer for forest canopy height estimation

Tony Chang; Kiarie Ndegwa; Andreas Gros; Vincent A. Landau; Luke J. Zachmann; Bogdan State; Mitchell A. Gritts; Colton W. Miller; Nathan E. Rutenbeck; Scott Conway; Guy Bayes

TL;DR

This work addresses the need for up-to-date, high-resolution forest canopy information to support wildfire risk mitigation and ecological management. It introduces VibrantVS, a high-resolution multi-task Vision Transformer trained on 4-band NAIP imagery to estimate canopy height models (CHMs) and canopy cover, and benchmarks it against three baselines (Meta, LANDFIRE, ETH) across 24 EPA Level 3 ecoregions in the western United States. VibrantVS is more accurate and more precise than the baselines, with a median mean absolute error (MAE) of 2.71 m compared to 4.83–7.05 m, and it produces CHMs at 0.5 m resolution that can be updated every three years or less. The model draws on large, diverse training data, high-resolution inputs, and novel architectural enhancements to support downstream forest structure analyses (e.g., TAO segmentation) and wildfire risk modeling at high spatial fidelity, with practical implications for ecological monitoring and land management.

Abstract

This paper explores the application of a novel multi-task vision transformer (ViT) model for the estimation of canopy height models (CHMs) using 4-band National Agriculture Imagery Program (NAIP) imagery across the western United States. We compare the accuracy and precision of this model, aggregated across ecoregions and height classes, against three peer-reviewed benchmark models. Key findings suggest that, while the benchmark models can provide high precision in localized areas, the VibrantVS model has substantial advantages across a broad range of ecoregions in the western United States: higher accuracy, higher precision, the ability to generate updated inference at a cadence of three years or less, and high spatial resolution. The VibrantVS model provides significant value for ecological monitoring and land management decisions, including wildfire mitigation.
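
To make the accuracy metric concrete, the following is a minimal sketch of how a mean absolute error (MAE) could be computed per ecoregion and then summarized with a median. It is illustrative only: the function names, the ecoregion labels, and the toy height values are hypothetical, and aggregating as "median of per-ecoregion MAE" is an assumption rather than the evaluation procedure confirmed by the paper.

```python
from statistics import median


def mean_absolute_error(predicted, reference):
    """MAE in metres between paired predicted and reference canopy heights."""
    if len(predicted) != len(reference) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(p - r) for p, r in zip(predicted, reference)) / len(predicted)


def median_mae_by_ecoregion(samples):
    """Return per-ecoregion MAE and the median of those values.

    samples maps an ecoregion name to a (predicted, reference) pair of lists.
    Taking the median over ecoregions is an assumed aggregation, not
    necessarily the one used in the paper.
    """
    per_region = {
        name: mean_absolute_error(pred, ref)
        for name, (pred, ref) in samples.items()
    }
    return per_region, median(per_region.values())


# Toy canopy heights in metres (hypothetical values, not data from the paper).
samples = {
    "ecoregion_a": ([10.0, 20.0, 5.0], [12.0, 18.0, 5.0]),
    "ecoregion_b": ([30.0, 2.0], [25.0, 3.0]),
    "ecoregion_c": ([0.0, 15.0], [0.0, 17.0]),
}

per_region, overall = median_mae_by_ecoregion(samples)
for name, mae in per_region.items():
    print(f"{name}: MAE = {mae:.2f} m")
print(f"median MAE across ecoregions = {overall:.2f} m")
```

In practice the predicted and reference values would be co-registered CHM pixels (for example, model output versus a lidar-derived CHM), and a median summary is less sensitive than a mean to a few ecoregions with unusually large errors.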
