Leaf clustering using circular densities
Luis E. Nieto-Barajas
TL;DR
This work casts leaf shape characterization via centroid contour distances (CCD) into a density-based framework: each CCD path is modeled as a circular density, which normalizes scale and allows rotation alignment by shifting to the mean or the mode. It compares four density-distance measures ($D_1$–$D_4$) within complete-linkage hierarchical clustering, under both mean and mode rotations, and evaluates them on a motivating 10-leaf set and on the OSU Leaf dataset. Key contributions include a practical normalization pipeline, the four distance measures (with $D_4$ tied to the first $2r$ trigonometric moments), and a heterogeneity-based criterion for comparing clusterings on the OSU data. The approach yields interpretable clusters for distinct leaf groups and provides ready-to-use code and data for researchers.
Abstract
In botany, leaf shape recognition is an important task. One way of characterising leaf shape is through the centroid contour distances (CCD). CCD paths may differ in resolution, so we normalise them by associating each contour with a circular density. Each density is then rotated by subtracting its preferred direction, taken as either the mean or the mode. Distance measures between the rotated densities are used in a hierarchical clustering procedure to group the leaves. We illustrate our approach with a small motivating dataset as well as a larger dataset.
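The normalisation steps described in the abstract can be illustrated with a minimal sketch. It is not the author's implementation, and it rests on several assumptions: CCD values are sampled at equally spaced angles starting at 0, the circular density is a discrete grid of `k` values obtained by linear interpolation, the rotation uses the mean direction rounded to the nearest grid step, and a plain L1 distance stands in for the paper's $D_1$–$D_4$, whose definitions are not given in this section. All function names are hypothetical.

```python
import math

TWO_PI = 2.0 * math.pi

def resample_ccd(ccd, k):
    """Linearly interpolate a closed CCD path onto k equally spaced angles.

    Assumption: the input distances are sampled at equally spaced angles
    around the centroid, starting at angle 0.
    """
    n = len(ccd)
    out = []
    for j in range(k):
        pos = j * n / k                  # fractional index into the source path
        i0 = int(pos)
        frac = pos - i0
        out.append((1.0 - frac) * ccd[i0 % n] + frac * ccd[(i0 + 1) % n])
    return out

def to_density(values):
    """Rescale positive values so their Riemann sum over [0, 2*pi) equals 1."""
    step = TWO_PI / len(values)
    total = sum(values) * step
    return [v / total for v in values]

def mean_direction(dens):
    """Angle of the first trigonometric moment of a discretised density."""
    k = len(dens)
    s = sum(f * math.sin(TWO_PI * j / k) for j, f in enumerate(dens))
    c = sum(f * math.cos(TWO_PI * j / k) for j, f in enumerate(dens))
    return math.atan2(s, c)

def rotate_to_mean(dens):
    """Circularly shift the grid so the mean direction lands near angle 0."""
    k = len(dens)
    shift = round(mean_direction(dens) * k / TWO_PI)
    return [dens[(j + shift) % k] for j in range(k)]

def normalise(ccd, k=360):
    """Resolution, scale and rotation normalisation in one call."""
    return rotate_to_mean(to_density(resample_ccd(ccd, k)))

def l1_distance(f, g):
    """L1 distance between two densities on the same grid (a stand-in metric)."""
    step = TWO_PI / len(f)
    return sum(abs(a - b) for a, b in zip(f, g)) * step

def toy_ccd(n, scale, phase, bump=0.0):
    """Synthetic CCD path: a smooth closed outline, purely for illustration."""
    angles = (TWO_PI * i / n for i in range(n))
    return [scale * (1.0 + 0.5 * math.cos(t - phase)
                     + bump * math.cos(3.0 * (t - phase))) for t in angles]

# Same outline at a different size, orientation and resolution.
leaf_a = normalise(toy_ccd(90, 1.0, 1.0))
leaf_b = normalise(toy_ccd(180, 3.0, 2.5))
# A different outline with three extra lobes.
leaf_c = normalise(toy_ccd(120, 1.0, 0.3, bump=0.4))

d_same = l1_distance(leaf_a, leaf_b)
d_diff = l1_distance(leaf_a, leaf_c)
```

In this toy setting `d_same` should be close to zero while `d_diff` is clearly larger, which is the behaviour the normalisation is meant to produce. A matrix of such pairwise distances could then be passed to any complete-linkage routine, for example `scipy.cluster.hierarchy.linkage` with `method="complete"`.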
