Tree species classification at the pixel-level using deep learning and multispectral time series in an imbalanced context

Florian Mouret, David Morin, Milena Planells, Cécile Vincent-Barbaroux

TL;DR

Applied to Sentinel-2 multispectral satellite image time series, deep learning models significantly improve tree species classification over the Random Forest (RF) algorithm, especially in an imbalanced context where RF tends to predict the majority class.

Abstract

This paper investigates tree species classification using Sentinel-2 multispectral satellite image time series. Despite their critical importance for many applications, tree species maps are often unavailable, outdated, or inaccurate for large areas. Many studies have highlighted the value of remote sensing time series for producing these maps. However, many methods proposed in the literature still rely on a standard classification algorithm, usually the Random Forest (RF) algorithm with vegetation indices. This study shows that deep learning (DL) models can significantly improve classification results, especially in an imbalanced context where the RF algorithm tends to be biased towards the majority class. In our use case in the center of France with 10 tree species, we obtain an overall accuracy (OA) of around 95% and an F1-macro score of around 80% using three different benchmark deep learning architectures. In contrast, the RF algorithm yields an OA of 93% and an F1 of 60%, indicating that the minority classes are not classified with sufficient accuracy. Therefore, the proposed framework is a strong baseline that can be easily implemented in most scenarios, even with a limited amount of reference data. Our results highlight that a standard multilayer perceptron can be competitive when it uses batch normalization and has a sufficient number of parameters. Other architectures (convolutional or attention-based) can also achieve strong results when tuned properly. Furthermore, our results show that DL models are naturally robust to imbalanced data, although similar results can be obtained using dedicated techniques.
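The gap between OA and F1-macro reported in the abstract is typical of imbalanced data: OA is dominated by the majority class, while F1-macro weights every class equally. The sketch below illustrates this on hypothetical toy labels (the class names "oak" and "fir", the sample counts, and both sets of predictions are invented for illustration and are not taken from the paper). It is a minimal standard-library implementation of the two metrics, not the authors' evaluation code.

```python
def overall_accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label matches the reference."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def f1_macro(y_true, y_pred):
    """Unweighted mean of per-class F1 scores, so each class counts equally."""
    classes = sorted(set(y_true) | set(y_pred))
    scores = []
    for c in classes:
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        # A class that is never predicted and never present scores 0 here.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical imbalanced reference labels: 95 majority, 5 minority samples.
y_true = ["oak"] * 95 + ["fir"] * 5

# Classifier A (hypothetical): always predicts the majority class.
y_majority = ["oak"] * 100

# Classifier B (hypothetical): recovers 4 of the 5 minority samples,
# at the cost of labelling 2 majority samples as the minority class.
y_balanced = ["oak"] * 93 + ["fir"] * 2 + ["fir"] * 4 + ["oak"] * 1

oa_a, f1_a = overall_accuracy(y_true, y_majority), f1_macro(y_true, y_majority)
oa_b, f1_b = overall_accuracy(y_true, y_balanced), f1_macro(y_true, y_balanced)

print(f"A: OA={oa_a:.2f}  F1-macro={f1_a:.2f}")
print(f"B: OA={oa_b:.2f}  F1-macro={f1_b:.2f}")
```

In this toy case the two classifiers differ by only two points of OA, while their F1-macro scores differ much more, which is why both metrics are worth reporting when classes are imbalanced.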

Paper Structure

This paper contains 20 sections, 5 figures, 5 tables.

Figures (5)

  • Figure 1: Our study area is delimited in grey, the boundaries between its 11 Sentinel-2 tiles are in lighter grey, and administrative departments are outlined in black. The training plots are shown in blue, while the independent validation plots (4 species) are shown in orange.
  • Figure 2: Illustration of different reference plots from our database. Each plot covers different S2 pixels (here the plots are very small, covering 6 to 8 pixels). Each dominant tree species is represented by a different color with the corresponding name.
  • Figure 3: Methodological steps used to map tree species with S2 time series.
  • Figure 4: The different deep learning architectures tested in our analysis. The last layer is a standard linear layer with a number of neurons equal to the number of classes to be predicted and a softmax activation.
  • Figure 5: Normalized confusion matrices averaged over 10-fold cross-validation (CV) using SMOTE oversampling.
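
The averaged, normalized confusion matrices mentioned for Figure 5 can be computed along the following lines. This is a hedged sketch only: it assumes row normalization (each row divided by the number of reference samples of that class) applied per fold before averaging, which the caption does not specify. The two-class labels and the two folds are hypothetical, and the SMOTE oversampling step is not reproduced here.

```python
def confusion_matrix(y_true, y_pred, classes):
    """Count matrix with reference classes as rows and predictions as columns."""
    index = {c: i for i, c in enumerate(classes)}
    n = len(classes)
    cm = [[0] * n for _ in range(n)]
    for t, p in zip(y_true, y_pred):
        cm[index[t]][index[p]] += 1
    return cm


def normalize_rows(cm):
    """Divide each row by its total; rows without samples stay at zero."""
    out = []
    for row in cm:
        total = sum(row)
        out.append([v / total if total else 0.0 for v in row])
    return out


def average_matrices(matrices):
    """Element-wise mean of equally sized square matrices."""
    k = len(matrices)
    n = len(matrices[0])
    return [[sum(m[i][j] for m in matrices) / k for j in range(n)]
            for i in range(n)]


# Hypothetical classes and per-fold (reference, prediction) pairs.
classes = ["A", "B"]
folds = [
    (["A", "A", "A", "A", "B", "B"], ["A", "A", "A", "B", "B", "A"]),
    (["A", "A", "A", "A", "B", "B"], ["A", "A", "A", "A", "B", "B"]),
]

per_fold = [normalize_rows(confusion_matrix(t, p, classes)) for t, p in folds]
mean_cm = average_matrices(per_fold)

for name, row in zip(classes, mean_cm):
    print(name, [round(v, 3) for v in row])
```

With row normalization, the diagonal of each matrix is the per-class recall, so a weak diagonal entry for a minority class is visible even when overall accuracy is high.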