Riesz feature representation: scale equivariant scattering network for classification tasks
Tin Barisin, Jesus Angulo, Katja Schladitz, Claudia Redenbach
TL;DR
The paper tackles scale sensitivity in traditional scattering descriptors by introducing a Riesz-transform–based feature representation that is scale-equivariant and avoids explicit scale sampling. It constructs a hierarchical, nonexpansive representation from first- and higher-order Riesz transforms and a steerable base function, culminating in a compact 85-feature descriptor that achieves scale generalization through global pooling. Empirical results on MNIST Large Scale, KTH-tips, and CIFAR-10 demonstrate robust performance under unseen scales and competitive texture/digit classification, while highlighting advantages in data efficiency and stability over purely data-driven deep nets. The work also points to promising hybrid integrations with CNNs to combine discriminative power with scale-robustness, and outlines avenues for scale-aware bounding boxes and extended applications.
Abstract
Scattering networks yield powerful and robust hierarchical image descriptors that do not require lengthy training and work well with very little training data. However, they rely on sampling the scale dimension. Hence, they become sensitive to scale variations and are unable to generalize to unseen scales. In this work, we define an alternative feature representation based on the Riesz transform. We detail and analyze the mathematical foundations behind this representation. In particular, it inherits scale equivariance from the Riesz transform and completely avoids sampling of the scale dimension. Additionally, the number of features in the representation is reduced by a factor of four compared to scattering networks. Nevertheless, our representation performs comparably well for texture classification, with an interesting addition: scale equivariance. Our method yields superior performance when dealing with scales outside of those covered by the training dataset. The usefulness of the equivariance property is demonstrated on the digit classification task, where accuracy remains stable even for scales four times larger than the one chosen for training. As a second example, we consider classification of textures.
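The scale equivariance that the representation inherits from the Riesz transform can be checked numerically. The following is a minimal sketch, not the authors' implementation: it assumes the common Fourier-multiplier definition of the first-order Riesz transform, -i ξ_j / |ξ| (sign conventions vary between texts), on a periodic grid. The helper names `riesz_first_order`, `shrink_by_two`, and `band_limited_image` are hypothetical, and the test image is synthetic. It covers only the first-order transform, not the full hierarchical representation or its pooling.

```python
import numpy as np


def riesz_first_order(image):
    """Return the two first-order Riesz components (R1, R2) of a real 2D array.

    Assumes the Fourier multiplier -i * xi_j / |xi| on a periodic grid.
    """
    rows, cols = image.shape
    fy = np.fft.fftfreq(rows)[:, None]   # vertical frequencies
    fx = np.fft.fftfreq(cols)[None, :]   # horizontal frequencies
    norm = np.sqrt(fx**2 + fy**2)
    norm[0, 0] = 1.0                     # multiplier is 0 at DC, avoid 0/0
    spectrum = np.fft.fft2(image)
    r1 = np.fft.ifft2(-1j * fx / norm * spectrum).real
    r2 = np.fft.ifft2(-1j * fy / norm * spectrum).real
    return r1, r2


def shrink_by_two(array):
    """Periodic rescaling x -> 2x: the same pattern at half its original size."""
    n_rows, n_cols = array.shape
    iy = (2 * np.arange(n_rows)) % n_rows
    ix = (2 * np.arange(n_cols)) % n_cols
    return array[np.ix_(iy, ix)]


def band_limited_image(size=64, max_freq=5, n_waves=8, seed=0):
    """Synthetic test image: a sum of low-frequency plane waves (illustrative only)."""
    rng = np.random.default_rng(seed)
    y, x = np.mgrid[0:size, 0:size]
    image = np.zeros((size, size))
    for _ in range(n_waves):
        k1, k2 = rng.integers(-max_freq, max_freq + 1, size=2)
        phase = rng.uniform(0, 2 * np.pi)
        image += np.cos(2 * np.pi * (k1 * x + k2 * y) / size + phase)
    return image


image = band_limited_image()
r1, r2 = riesz_first_order(image)
r1_small, r2_small = riesz_first_order(shrink_by_two(image))

# Equivariance: transforming the rescaled image equals rescaling the transform.
max_error = max(np.abs(r1_small - shrink_by_two(r1)).max(),
                np.abs(r2_small - shrink_by_two(r2)).max())
print(f"max equivariance error: {max_error:.2e}")
```

Because the multiplier depends only on the direction of the frequency vector and not on its magnitude, rescaling the input leaves it unchanged. For this band-limited toy image the printed error should therefore sit at the level of floating-point round-off. No scale parameter appears anywhere in the transform, which is the property the abstract refers to when it says that sampling of the scale dimension is avoided.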
