Riesz networks: scale invariant neural networks in a single forward pass

Tin Barisin, Katja Schladitz, Claudia Redenbach

TL;DR

The paper addresses scale variation in image analysis by introducing the Riesz network, a scale-equivariant neural network built on the Riesz transform. Its layers combine spatial information with the Riesz transform instead of standard convolutions. Because the Riesz transform commutes with rescaling, $\mathcal{R}_j(L_a(f)) = L_a(\mathcal{R}_j(f))$, the network generalizes to unseen scales in a single forward pass without explicit scale sampling. Empirically, the method performs robustly on crack segmentation in CT images of concrete and on the MNIST Large Scale dataset, with an emphasis on multiscale and real-world data, while using relatively few parameters (about 18k). The work highlights that a principled, scale-equivariant approach can outperform traditional multiscale CNNs and scale-channel methods, reduces data requirements, and offers a path toward 3D extensions and rotation-invariant variants.

Abstract

Scale invariance of an algorithm refers to its ability to treat objects equally independently of their size. For neural networks, scale invariance is typically achieved by data augmentation. However, when presented with a scale far outside the range covered by the training set, neural networks may fail to generalize. Here, we introduce the Riesz network, a novel scale invariant neural network. Instead of standard 2d or 3d convolutions for combining spatial information, the Riesz network is based on the Riesz transform which is a scale equivariant operation. As a consequence, this network naturally generalizes to unseen or even arbitrary scales in a single forward pass. As an application example, we consider detecting and segmenting cracks in tomographic images of concrete. In this context, 'scale' refers to the crack thickness which may vary strongly even within the same sample. To prove its scale invariance, the Riesz network is trained on one fixed crack width. We then validate its performance in segmenting simulated and real tomographic images featuring a wide range of crack widths. An additional experiment is carried out on the MNIST Large Scale data set.

Paper Structure

This paper contains 30 sections, 1 theorem, 15 equations, 20 figures, 4 tables.

Key Result

Lemma 1

The Riesz transform is scale equivariant, i.e. $\mathcal{R}_j(L_a(f)) = L_a(\mathcal{R}_j(f))$ for $f \in L_2(\mathbb{R}^d)$, where $L_a$ denotes rescaling by a factor $a > 0$.
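Lemma 1 can be checked numerically on a small example. The sketch below is not the authors' implementation. It assumes the standard Fourier-multiplier form of the first-order Riesz transform, $\widehat{\mathcal{R}_j f}(u) = -i\,\frac{u_j}{|u|}\,\hat{f}(u)$, which is not spelled out in this summary, and it works on a periodic grid with NumPy. The function names and the test image are hypothetical. Since the multiplier depends only on the direction $u/|u|$, subsampling a band-limited image by a factor of 2 before or after the transform should give the same result up to floating-point error.

```python
import numpy as np


def riesz_first_order(image):
    """Apply the first-order Riesz transforms R_1, R_2 to a 2D array.

    Assumption: Fourier multiplier -i * u_j / |u| on a periodic grid.
    """
    rows, cols = image.shape
    u1 = np.fft.fftfreq(rows)[:, None]   # frequencies along axis 0
    u2 = np.fft.fftfreq(cols)[None, :]   # frequencies along axis 1
    norm = np.sqrt(u1**2 + u2**2)
    norm[0, 0] = 1.0                     # avoid 0/0 at DC; the numerator is 0 there
    spectrum = np.fft.fft2(image)
    r1 = np.fft.ifft2(-1j * u1 / norm * spectrum).real
    r2 = np.fft.ifft2(-1j * u2 / norm * spectrum).real
    return r1, r2


def band_limited_image(n):
    """Hypothetical test image: two plane waves with low integer frequencies."""
    x, y = np.meshgrid(np.arange(n), np.arange(n), indexing="ij")
    return (np.cos(2 * np.pi * (3 * x + 2 * y) / n)
            + 0.5 * np.sin(2 * np.pi * (5 * x - 4 * y) / n))


fine = band_limited_image(64)
coarse = fine[::2, ::2]                  # same pattern, rescaled by a factor 1/2

r1_fine, r2_fine = riesz_first_order(fine)
r1_coarse, r2_coarse = riesz_first_order(coarse)

# Compare rescale-then-transform with transform-then-rescale.
err = max(np.abs(r1_coarse - r1_fine[::2, ::2]).max(),
          np.abs(r2_coarse - r2_fine[::2, ::2]).max())
print(f"max equivariance error: {err:.2e}")
```

The check is exact only because the test image is band-limited and the scale factor is an integer. For general images and scales, discretization and aliasing introduce deviations that this sketch does not quantify.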

Figures (20)

  • Figure 1: Examples of similar objects appearing on different scales: section of a CT image of concrete showing a crack of locally varying thickness (left) and pedestrians at different distances from the camera (right, taken from ess08).
  • Figure 2: Visualizations of Riesz transform kernels of first and second order. First row (from left to right): $\mathcal{R}_1$ and $\mathcal{R}_2$. Second row (from left to right): $\mathcal{R}^{(2,0)}$, $\mathcal{R}^{(1,1)}$, and $\mathcal{R}^{(0,2)}$.
  • Figure 3: Illustration of the Riesz transform on a mock example of $550\times 550$ pixels: aligned rectangles with equal aspect ratio and constant gray value $255$ (left) and response of the second order Riesz transform $\mathcal{R}^{(2,0)}$ of the left image sampled horizontally through the centers of the rectangles (right).
  • Figure 4: Building blocks of Riesz networks: the base Riesz layer from equation (\ref{base:layer}) (left) and the full Riesz layer from equation (\ref{full:layer}) (right).
  • Figure 5: Cracks of width 3 used for training: before (first row) and after cropping (second row). Image sizes are $256\times 256$ (first row) and $64 \times 64$ (second row).
  • ...and 15 more figures
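
The second-order transforms named in Figures 2 and 3 can be sketched in the same way. The code below is an illustration, not the authors' code. It assumes that the higher-order transforms are plain compositions of first-order ones, e.g. $\mathcal{R}^{(2,0)} = \mathcal{R}_1\mathcal{R}_1$ and $\mathcal{R}^{(1,1)} = \mathcal{R}_1\mathcal{R}_2$, without a normalization constant, so that their Fourier multipliers are $-u_1^2/|u|^2$, $-u_1u_2/|u|^2$ and $-u_2^2/|u|^2$. The function name and the rectangle image are hypothetical.

```python
import numpy as np


def riesz_second_order(image):
    """Return the second-order Riesz transforms of a 2D array, keyed by order.

    Assumption: R^(2,0) = R_1 R_1, R^(1,1) = R_1 R_2, R^(0,2) = R_2 R_2,
    computed with Fourier multipliers on a periodic grid.
    """
    rows, cols = image.shape
    u1 = np.fft.fftfreq(rows)[:, None]
    u2 = np.fft.fftfreq(cols)[None, :]
    sq_norm = u1**2 + u2**2
    sq_norm[0, 0] = 1.0                  # avoid 0/0 at DC; numerators are 0 there
    spectrum = np.fft.fft2(image)

    def apply(multiplier):
        return np.fft.ifft2(multiplier * spectrum).real

    return {
        (2, 0): apply(-(u1**2) / sq_norm),
        (1, 1): apply(-(u1 * u2) / sq_norm),
        (0, 2): apply(-(u2**2) / sq_norm),
    }


# Hypothetical mock image: one bright rectangle on a dark background.
rect = np.zeros((64, 64))
rect[20:44, 12:52] = 255.0
responses = riesz_second_order(rect)

# The multipliers of R^(2,0) and R^(0,2) sum to -1 away from DC, so their
# responses add up to the negated, mean-removed image.
residual = responses[(2, 0)] + responses[(0, 2)] + (rect - rect.mean())
print(f"max residual: {np.abs(residual).max():.2e}")
```

The identity used as a sanity check follows directly from the assumed multipliers. It does not depend on the image content.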

Theorems & Definitions (3)

  • Lemma 1
  • proof
  • proof