Assessing 3D tree model quality and species classification using imbalance indices

Sophie J. Kersting, Mareike Fischer

Abstract

We investigate the use of additional 3D and phylogenetic non-3D tree balance indices for analyzing and monitoring forests, using an exemplary "virtual forest" dataset from Wytham Woods, Oxford, UK. This study assesses 3D model quality, species classification performance, and the relevance of these indices. Our study shows that indices stemming from the study of ancestry trees of species can be successfully applied to 3D models of organic trees and, accompanied by recently introduced 3D imbalance indices, offer a complementary perspective on 3D tree models and improve the detection of deviations. Their computational efficiency, combined with the simple and reproducible workflow presented in this manuscript, makes them a computationally feasible quality control step in 3D model construction. Species classification models reached an estimated accuracy of up to 81.8% and allowed confident species predictions for a large portion of the unlabeled trees in the dataset. While conventional tree metrics already provide strong predictive performance, the addition of filtered 3D and non-3D statistics improved results consistently, particularly for minority species classes. Alongside this manuscript, we provide an updated version of the R package treeDbalance that includes the necessary functionality, and we release the derived index datasets and species predictions.

Paper Structure

This paper contains 44 sections, 3 equations, 13 figures, 13 tables.

Figures (13)

  • Figure 2.1: A visualization of the dataset and the workflow. Each tree has ten different 3D tree models (in the format of a QSM or a rooted 3D tree model) as well as the ten corresponding extracted non-3D topologies, shown here for the sycamore (ACERPS) with ID 180b. The rooted 3D trees are colored according to their internal $\mathcal{A}$ imbalance (see Section \ref{sec:statistics3DT}), where lighter blue tones indicate low and darker red tones high imbalance. This helps to highlight some common flaws and uncertainties of QSMs: cylinders forming a bend or angle in the stem (e.g. 180b$^7$), protruding and non-connecting cylinders (e.g. 180b$^3$ and 180b$^4$), or the absence of certain tree parts (e.g. 180b$^0$). This example was chosen because it is small and clear (height of $\approx 3.5$ m and $9$ to $14$ tips/leaves in the non-3D topologies), in contrast to the average tree height ($\approx 14$ m) and average number of leaves (791) in the dataset. Tree 180b has a sister stem 180a (height $\approx 25.3$ m, with around 1526 leaves). Both are depicted on the left side with a human figure for scale. The sister stem also explains why 180b leans to the side, i.e., has a high external imbalance. In the QSM quality assessment process, 180b$^2$ was chosen as the best and 180b$^9$ as the worst QSM (QSM iterations 2, 6, 5, and 4 $\in B_{\text{180b}}$ were candidates for the best QSM, while 1, 7, 8, 3, 0, and 9 were filtered out in the negative selection since each of them was an outlier QSM for 1 to 11 statistics).
  • Figure 2.2: Species distribution across tree height. The barplot in the background shows the total number of trees per height class, and the points show the respective numbers for the individual species. The points are shaped like the leaves of the corresponding tree species and are only depicted for values $>0$. The legend also provides the total number of trees per species (N). The species colors are the same as in calders_laser_2022 to allow easy comparison of the visualized results of both studies.
  • Figure 2.3: Two exemplary point clouds, shown on the left by themselves, in the middle with their respective "best" QSM as decided by our method, and on the right with a flawed QSM. (a) Point cloud of tree 180b consisting of 3,803 points, with QSMs 180b$^2$ (best) and 180b$^8$ with 63 and 42 cylinders, respectively. (b) Point cloud of tree 145c consisting of 3,039 points, with QSMs 145c$^8$ (best) and 145c$^5$ with 66 and 73 cylinders, respectively. The QSMs are colored according to their internal $\mathcal{A}$ imbalance (see Section \ref{sec:statistics3DT}).
  • Figure 3.1: Exemplary results of the QSM quality assessment for (a) the sycamore (ACERPS) with ID 145c (145c$^8$ has height 3.42 m, DBH 0.05 m and 16 leaves), (b) the sycamore with ID 8161b (8161b$^1$ has height 5.26 m, DBH 0.04 m and 23 leaves), and (c) the common hazel (CORYAV) with ID 8177 (8177$^1$ has height 5.9 m, DBH 0.06 m and 292 leaves). The QSMs are sorted by outlier count and then by quality score. The QSMs are colored according to their internal $\mathcal{A}$ imbalance (see Section \ref{sec:statistics3DT}).
  • Figure 3.2: Species predictions of the unlabeled trees based on the final RF and GB models iii). Each tree is shown with the most probable of the four species labels, so the corresponding probability is necessarily $\geq 25\%$ (for this data it is $>0.3$). Transparency indicates the probability/certainty of the predictions. Predictions with probability $\geq0.9$ are marked with a $\star$. Note that the class Other comprises QUERRO, CRATMO, and ACERCA. Class counts for RF/GB: ACERPS 83/73, CORYAV 26/30, FRAXEX 1/7, and Other 8/8.
  • ...and 8 more figures
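
The labeling rule described in the caption of Figure 3.2 can be illustrated with a minimal Python sketch. It is not the authors' implementation (their tooling is the R package treeDbalance); the function name `label_prediction`, the constant names, and the example probabilities are hypothetical and chosen only for illustration. The sketch assumes each tree comes with one predicted probability per class, picks the most probable class, and flags predictions with probability $\geq 0.9$. It also shows why the winning probability is always at least $1/4$ with four classes.

```python
from typing import Dict, Tuple

# Class labels as named in the Figure 3.2 caption; "Other" pools QUERRO, CRATMO, and ACERCA.
CLASSES = ("ACERPS", "CORYAV", "FRAXEX", "Other")
STAR_THRESHOLD = 0.9  # predictions at or above this probability are starred


def label_prediction(probs: Dict[str, float]) -> Tuple[str, float, bool]:
    """Return (most probable class, its probability, starred flag).

    Hypothetical helper: `probs` maps each class label to its predicted probability.
    """
    if set(probs) != set(CLASSES):
        raise ValueError("expected exactly one probability per class")
    if abs(sum(probs.values()) - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to 1")
    # Ties are resolved in favor of the class listed first in CLASSES.
    best = max(CLASSES, key=lambda c: probs[c])
    p = probs[best]
    # With K classes whose probabilities sum to 1, the maximum is at least 1/K.
    assert p >= 1.0 / len(CLASSES) - 1e-12
    return best, p, p >= STAR_THRESHOLD


if __name__ == "__main__":
    # Made-up probabilities for a single tree, for illustration only.
    example = {"ACERPS": 0.93, "CORYAV": 0.04, "FRAXEX": 0.02, "Other": 0.01}
    print(label_prediction(example))
```

The lower bound of $1/4$ follows from the pigeonhole argument in the code comment, while the caption's stricter observation that all winning probabilities exceed $0.3$ is an empirical property of the dataset and cannot be derived from the rule itself.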