Mining Field Data for Tree Species Recognition at Scale
Dimitri Gominski, Daniel Ortiz-Gonzalo, Martin Brandt, Maurice Mugabowindekwe, Rasmus Fensholt
TL;DR
This work addresses scalable species identification for individual trees by linking public forest inventory data with high-resolution aerial imagery. It introduces a three-step pipeline: detect individual trees in imagery at 15–25 cm resolution using an ensemble of pretrained models, match detections to field plots within a 4 m tolerance, and train a deep classifier on the resulting labels. The key findings are that a ResNet34 classifier can learn from automatically mined labels, and that incorporating noisy and unlabeled data via semi-supervised learning markedly improves class-balanced performance. The approach shows potential for large-scale species mapping across diverse ecosystems and could extend to other forest attributes such as biomass, height, and health.
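The matching step of the pipeline can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes that detections and inventory records both carry per-tree coordinates in the same projected coordinate system (metres), and it uses a simple greedy one-to-one rule, which the summary does not specify. The function name `match_detections_to_field_trees`, the parameter `max_dist_m`, and the toy coordinates and species are hypothetical; only the 4 m tolerance comes from the text above.

```python
import math


def match_detections_to_field_trees(detections, field_trees, max_dist_m=4.0):
    """Greedy one-to-one matching of detected crowns to inventory trees.

    detections:  list of (x, y) crown centres in a projected CRS (metres).
    field_trees: list of (x, y, species) inventory records in the same CRS.
    Returns a list of (detection_index, field_index, species, distance_m).
    """
    # Collect every candidate pair that falls within the tolerance.
    candidates = []
    for i, (dx, dy) in enumerate(detections):
        for j, (fx, fy, _species) in enumerate(field_trees):
            dist = math.hypot(dx - fx, dy - fy)
            if dist <= max_dist_m:
                candidates.append((dist, i, j))

    # Accept the closest pairs first; each detection and each field tree
    # is used at most once (an assumed rule, not taken from the paper).
    candidates.sort()
    used_det, used_field, matches = set(), set(), []
    for dist, i, j in candidates:
        if i in used_det or j in used_field:
            continue
        used_det.add(i)
        used_field.add(j)
        matches.append((i, j, field_trees[j][2], dist))
    return matches


# Toy data, invented for illustration only.
detections = [(0.0, 0.0), (10.0, 0.0), (50.0, 50.0)]
field_trees = [
    (1.0, 1.0, "Picea abies"),
    (10.0, 3.0, "Fagus sylvatica"),
    (30.0, 30.0, "Quercus robur"),
]
labels = match_detections_to_field_trees(detections, field_trees)
```

In this toy case the first two detections each receive a species label, while the third detection and the third inventory tree stay unmatched because no partner lies within 4 m. Unmatched detections of this kind are one plausible source of the unlabeled data points that the semi-supervised experiments draw on.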
Abstract
Individual tree species labels are particularly hard to acquire due to the expert knowledge needed and the limitations of photointerpretation. Here, we present a methodology to automatically mine species labels from public forest inventory data, using available pretrained tree detection models. We identify tree instances in aerial imagery and match them to field data with close to zero human involvement. We conduct a series of experiments on the resulting dataset and show a beneficial effect when adding noisy or even unlabeled data points, highlighting a strong potential for large-scale individual species mapping.
