Explainable Binary Classification of Separable Shape Ensembles
Zachary Grey, Nicholas Fisher, Andrew Glaws
TL;DR
This work presents a framework for explainable binary classification of large ensembles of boundary curves. It formulates Separable Shape Tensors (SST) that separate generalized scale from undulation while preserving rigid invariances. It grounds SST in a dual reproducing kernel Hilbert space (RKHS) interpretation via a Hilbert-Schmidt integral operator, which enables a finite, efficient discretization through Nyström-based SRQD and projections onto the Grassmannian and symmetric positive definite (SPD) matrix manifolds. The methodology yields interpretable, statistically testable features (t, ell) whose distributions across image ensembles are compared with a product maximum mean discrepancy (pMMD), without requiring labeled training data. In experiments on EBSD and battery SEM images, the approach is sensitive to subvisual differences in shape and robust to segmentation variations, and the experiments offer practical guidance for parameter settings, highlighting its potential as a scalable, explainable tool for image-based shape analysis.
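The split into a Grassmannian part and an SPD part can be illustrated with a minimal sketch. It assumes a curve sampled at n planar points with uniform weights, and it uses a plain thin SVD (a polar decomposition) rather than the Nyström-based SRQD named above, whose details are not given here. The names `separable_shape`, `X_tilde`, and `P` are hypothetical and chosen for illustration only.

```python
import numpy as np

def separable_shape(X):
    """Split an (n, 2) array of curve points into (X_tilde, P).

    Illustrative assumption: uniform weights and a plain polar
    decomposition; this is not the SRQD discretization itself.
    """
    X = np.asarray(X, dtype=float)
    Xc = X - X.mean(axis=0)                             # remove translation
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)   # thin SVD, U is (n, 2)
    X_tilde = U @ Vt                # orthonormal columns: undulation part
    P = Vt.T @ np.diag(s) @ Vt      # symmetric, positive definite if s > 0: scale part
    return X_tilde, P               # Xc == X_tilde @ P


# Example: an ellipse sampled at 200 points, shifted away from the origin.
theta = np.linspace(0.0, 2.0 * np.pi, 200, endpoint=False)
ellipse = np.column_stack([3.0 * np.cos(theta), np.sin(theta)]) + np.array([5.0, -2.0])
X_tilde, P = separable_shape(ellipse)
```

In this sketch the column span of `X_tilde` is unchanged when the curve is translated, rotated, or linearly stretched, while `P` absorbs the linear stretch. That matches the intended reading of "scale" versus "undulation", under the stated assumptions.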
Abstract
Scientists, engineers, biologists, and technology specialists routinely use image segmentation to extract shape ensembles containing many thousands of curves that represent patterns in observations and measurements. These large curve ensembles facilitate inferences about important changes when comparing and contrasting images. We introduce novel pattern recognition formalisms combined with inference methods over large ensembles of segmented curves. Our formalism involves accurately approximating eigenspaces of composite integral operators to motivate discrete, dual representations of curves collocated at quadrature nodes. Approximations are projected onto underlying matrix manifolds, and the resulting separable shape tensors constitute rigid-invariant decompositions of curves into generalized (linear) scale variations and complementary (nonlinear) undulations. With thousands of curves segmented from pairs of images, we demonstrate how data-driven features of separable shape tensors inform explainable binary classification utilizing a product maximum mean discrepancy. The approach requires no labeled data, builds interpretable feature spaces in seconds without high-performance computing, and detects discrepancies too subtle for cursory visual inspection.
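As a rough illustration of comparing two curve ensembles with a product maximum mean discrepancy, the sketch below computes a biased squared-MMD estimate under a product of two Gaussian kernels, one per feature block, together with a permutation p-value. Several things here are assumptions rather than details taken from the abstract: the Gaussian kernels, the Euclidean distances on the features, the bandwidths, and the hypothetical names `gaussian_gram`, `product_mmd2`, and `permutation_pvalue`. The kernels and manifold metrics actually used may differ.

```python
import numpy as np

def gaussian_gram(A, B, bandwidth):
    """Gaussian kernel matrix between the rows of A and the rows of B."""
    sq = np.sum(A**2, axis=1)[:, None] + np.sum(B**2, axis=1)[None, :] - 2.0 * A @ B.T
    return np.exp(-np.maximum(sq, 0.0) / (2.0 * bandwidth**2))

def product_mmd2(t1, l1, t2, l2, bw_t=1.0, bw_l=1.0):
    """Biased MMD^2 estimate with the product kernel k_t * k_l (an assumption)."""
    def gram(ta, la, tb, lb):
        return gaussian_gram(ta, tb, bw_t) * gaussian_gram(la, lb, bw_l)
    return (gram(t1, l1, t1, l1).mean()
            + gram(t2, l2, t2, l2).mean()
            - 2.0 * gram(t1, l1, t2, l2).mean())

def permutation_pvalue(t1, l1, t2, l2, n_perm=200, seed=0):
    """Shuffle ensemble membership to approximate the null distribution."""
    rng = np.random.default_rng(seed)
    t, l, n = np.vstack([t1, t2]), np.vstack([l1, l2]), len(t1)
    observed = product_mmd2(t1, l1, t2, l2)
    count = 0
    for _ in range(n_perm):
        idx = rng.permutation(len(t))
        a, b = idx[:n], idx[n:]
        count += product_mmd2(t[a], l[a], t[b], l[b]) >= observed
    return observed, (count + 1) / (n_perm + 1)


# Synthetic stand-ins for per-curve features from two images (not real data).
rng = np.random.default_rng(1)
t_a, l_a = rng.normal(size=(50, 3)), rng.normal(size=(50, 4))
t_b, l_b = rng.normal(size=(50, 3)) + 3.0, rng.normal(size=(50, 4))
mmd2, p_value = permutation_pvalue(t_a, l_a, t_b, l_b)
```

No labels enter the computation: the only inputs are the two sets of features, one set per image, which is consistent with the "absent labeled data" setting described above.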
