Generalizations of the Normalized Radon Cumulative Distribution Transform for Limited Data Recognition
Matthias Beckmann, Robert Beinert, Jonas Bresch
TL;DR
The paper addresses affine distortions in image and shape recognition in the small data regime by introducing generalized $h$-normalized variants of the normalized Radon cumulative distribution transform (NR-CDT), which extend the R-CDT through generalized Radon transforms. It establishes invariance properties and linear separability for affine-transformed classes, and supports these results with 2D and 3D experiments, including 3D object recognition and SO(3) rotation data. Key contributions include max-normalized and $h$-normalized NR-CDTs, multidimensional and circular extensions, and invariance under a wide class of transformations, with near-perfect classification accuracies and clustering results in the reported experiments. The work offers a practical, scalable feature-extraction framework for invariant recognition and clustering in data-limited scenarios across Euclidean and non-Euclidean domains.
Abstract
The Radon cumulative distribution transform (R-CDT) exploits one-dimensional Wasserstein transport and the Radon transform to represent prominent features in images. It is closely related to the sliced Wasserstein distance and facilitates classification tasks, especially in the small data regime, like the recognition of watermarks in filigranology. Here, a typical issue is that the given data may be subject to affine transformations caused by the measuring process. To make the R-CDT invariant under arbitrary affine transformations, a two-step normalization of the R-CDT has been proposed in our earlier work. The aim of this paper is twofold. First, we propose a family of generalized normalizations to enhance flexibility for applications. Second, we study multi-dimensional and non-Euclidean settings by making use of generalized Radon transforms. We prove that our novel feature representations are invariant under certain transformations and allow for linear separation in feature space. Our theoretical results are supported by numerical experiments based on 2D images, 3D shapes and 3D rotation matrices, showing near-perfect classification accuracies and clustering results.
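To illustrate why normalizing a cumulative distribution transform can remove affine effects, the following minimal sketch treats a single one-dimensional projection, as one might obtain from one angle of a Radon transform. It assumes that the CDT with respect to the uniform reference on [0, 1] is the quantile function, and it uses a zero-mean, unit-standard-deviation normalization as one plausible choice; the paper's exact normalizations may differ. The names `cdt`, `normalize`, and `bump`, as well as the test density and the parameters `a` and `b`, are hypothetical and chosen only for this example.

```python
import numpy as np

def cdt(density, grid, levels):
    """Approximate quantile function of a sampled 1D density.

    Assumption: the CDT w.r.t. the uniform reference on [0, 1]
    coincides with the quantile function of the density.
    """
    mass = density / density.sum()   # normalize to a probability vector
    cdf = np.cumsum(mass)            # discrete cumulative distribution
    # Invert the CDF by linear interpolation at the requested levels.
    return np.interp(levels, cdf, grid)

def normalize(q):
    """Zero-mean, unit-standard-deviation normalization (illustrative choice)."""
    return (q - q.mean()) / q.std()

def bump(x):
    """Hypothetical test density (unnormalized): two Gaussian bumps."""
    return np.exp(-x**2) + 0.5 * np.exp(-(x - 2.0) ** 2 / 0.5)

grid = np.linspace(-10.0, 10.0, 4001)
levels = np.linspace(0.01, 0.99, 99)

# Affine change of variable x -> a * x + b with a > 0.
a, b = 1.7, -1.2
f = bump(grid)
g = bump((grid - b) / a)             # density of the transformed variable

raw_gap = np.max(np.abs(cdt(f, grid, levels) - cdt(g, grid, levels)))
qf = normalize(cdt(f, grid, levels))
qg = normalize(cdt(g, grid, levels))
normalized_gap = np.max(np.abs(qf - qg))

print("gap before normalization:", raw_gap)
print("gap after normalization: ", normalized_gap)
```

The quantile function of the transformed density is approximately `a * q + b`, so subtracting the mean cancels the shift `b` and dividing by the standard deviation cancels the scale `a`. This covers only the one-dimensional step for a single projection; handling rotations and shears across Radon angles requires a further step that this sketch does not attempt.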
