OpenForest: A data catalogue for machine learning in forest monitoring

Arthur Ouaknine, Teja Kattenborn, Etienne Laliberté, David Rolnick

TL;DR

This paper provides a comprehensive overview of 86 open-access forest datasets across spatial scales, encompassing inventories; ground-based, aerial-based and satellite-based recordings; and country or world maps.

Abstract

Forests play a crucial role in Earth system processes and provide a suite of social and economic ecosystem services, but are significantly impacted by human activities, leading to a pronounced disruption of the equilibrium within ecosystems. Advancing forest monitoring worldwide offers advantages in mitigating human impacts and enhancing our comprehension of forest composition, alongside the effects of climate change. While statistical modeling has traditionally found applications in forest biology, recent strides in machine learning and computer vision have reached important milestones using remote sensing data, such as tree species identification, tree crown segmentation and forest biomass assessments. Open access data remain essential for improving such data-driven algorithms and methodologies. Here, we provide a comprehensive overview of 86 open access forest datasets across spatial scales, encompassing inventories, ground-based, aerial-based and satellite-based recordings, and country or world maps. These datasets are grouped in OpenForest, a dynamic catalogue open to contributions that strives to reference all available open access forest datasets. Moreover, in the context of these datasets, we aim to inspire research in machine learning applied to forest biology by establishing connections between contemporary topics, perspectives and challenges inherent in both domains. We hope to encourage collaborations among scientists, fostering the sharing and exploration of diverse datasets through the application of machine learning methods for large-scale forest monitoring. OpenForest is available at https://github.com/RolnickLab/OpenForest.

Paper Structure (40 sections, 3 figures, 9 tables)

Figures (3)

  • Figure 1: Overview of forest monitoring topics and challenges and the machine learning perspectives and challenges associated with them. Each forest monitoring topic or challenge is shown with its corresponding section number (in red). They are linked to the three main categories of machine learning perspectives and challenges, namely generalization, limited data and domain-specific objectives, which are also shown with their corresponding section numbers (in red).
  • Figure 2: Illustration of forest monitoring datasets at different scales. Inventories are in situ measurements carried out at the tree level. Ground-based datasets are recorded within or below the tree canopy. Aerial datasets are composed of recordings from sensors mounted on unoccupied (drones) or occupied aircraft. Satellite datasets are collected from sensors mounted on satellites orbiting the Earth. Map datasets are generated at the country or world level using datasets at the aerial or satellite scale.
  • Figure 3: Distribution of the reviewed open access forest datasets. (Left) World map of the locations of the reviewed datasets at the country level. Most of the datasets are regional and do not cover the entire associated country. Datasets categorized with a 'Worldwide' location or located at the continent level have been excluded for visualization purposes. (Right) Distributions of the publication years and of the recording years of the data used and/or released in the associated datasets.