PDE-CNNs: Axiomatic Derivations and Applications

Gijs Bellaard; Sei Sakata; Bart M. N. Smets; Remco Duits

TL;DR

This article focuses on Euclidean equivariant PDE-G-CNNs whose feature maps are two-dimensional throughout. It reveals new PDEs that can be used in PDE-CNNs and experimentally examines their impact on the accuracy of PDE-CNNs.

Abstract

PDE-based Group Convolutional Neural Networks (PDE-G-CNNs) use solvers of evolution PDEs as substitutes for the conventional components in G-CNNs. PDE-G-CNNs can offer several benefits simultaneously: fewer parameters, inherent equivariance, better accuracy, and data efficiency. In this article we focus on Euclidean equivariant PDE-G-CNNs where the feature maps are two-dimensional throughout. We call this variant of the framework a PDE-CNN. From a machine learning perspective, we list several practically desirable axioms and derive from these which PDEs should be used in a PDE-CNN, this being our main contribution. Our approach to geometric learning via PDEs is inspired by the axioms of scale-space theory, which we generalize by introducing semifield-valued signals. Our theory reveals new PDEs that can be used in PDE-CNNs and we experimentally examine what impact these have on the accuracy of PDE-CNNs. We also confirm for small networks that PDE-CNNs offer fewer parameters, increased accuracy, and better data efficiency when compared to CNNs.
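To make the idea of semifield-valued signals concrete, the following is a minimal sketch rather than the authors' implementation. It assumes a 1D signal, edge padding at the borders, and illustrative kernel scalings (the factor `4 * t` is an assumption). The names `semifield_conv`, `gauss`, and `quad` are hypothetical. The same routine yields Gaussian smoothing over the linear semifield (sum, product), a quadratic dilation over the max-plus semifield, and a quadratic erosion over the min-plus semifield.

```python
import numpy as np

def semifield_conv(f, k, add, mul):
    """1D semifield 'convolution': out(x) = ADD over y of MUL(f(x - y), k(y)).

    add: reduction along axis 0 (the semifield addition, e.g. np.sum or np.max).
    mul: elementwise binary operation (the semifield multiplication).
    k is sampled at offsets -r..r (odd length); borders use edge padding.
    """
    r = len(k) // 2
    padded = np.pad(f, r, mode="edge")
    # Row j holds f(x - y_j) for the offset y_j = j - r.
    shifted = np.stack(
        [padded[2 * r - j : 2 * r - j + len(f)] for j in range(len(k))]
    )
    return add(mul(shifted, k[:, None]), axis=0)

# Kernel offsets and an illustrative scale parameter (assumed values).
y = np.arange(-4, 5, dtype=float)
t = 2.0

gauss = np.exp(-y**2 / (4 * t))
gauss /= gauss.sum()          # discrete normalisation so the mean is preserved
quad = y**2 / (4 * t)         # quadratic structuring function

# Test signal: a single bright sample on a dark background.
f = np.zeros(21)
f[10] = 1.0

smoothed = semifield_conv(f, gauss, np.sum, np.multiply)  # linear semifield
dilated = semifield_conv(f, -quad, np.max, np.add)        # max-plus semifield
eroded = semifield_conv(f, quad, np.min, np.add)          # min-plus semifield
```

In this sketch the dilation never decreases the signal and the erosion never increases it, because both quadratic kernels are zero at the origin. Only the pair of operations passed to `semifield_conv` changes between the three cases.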

Paper Structure (26 sections, 15 theorems, 91 equations, 6 figures)


Introduction
Contributions
Short Outline
Background
Scale-Spaces
Semifields & Quasilinearity
Related Work
Semifield Theory
Semifield, Semimodules & Linearity
Functions, Measurability & Integration
Fourier Transform
Semifield Scale-space
Axioms
Examples
Isomorphic Scale-spaces
...and 11 more sections

Key Result

Proposition 1

Figures (6)

Figure 1: Diagram of an example CNN layer and PDE layer. The vertical direction represents the channels. The arrows represent the "flow" of the feature maps through the parts that make up a layer. In machine learning terms, the affine transformation block is equivalent to a 2D convolution module with bias and $1 \times 1$ kernels. PDE-based networks replace the usual components of a CNN layer (convolutions, max pooling, and non-linear activation functions) with solvers of evolution PDEs. The PDEs here are convection, diffusion, dilation, and erosion \ref{eq:pdes_pde_g_cnn}. By "solvers" we mean the mapping from the initial condition $f|_{t=0}$ to $f|_{t=T}$. We can take $T=1$ without loss of generality due to the scale-equivariance property of the PDEs (Axiom \ref{ax:r2_scaling}).
Figure 2: The Gaussian \ref{eq:intro_gaussian_scale_space_convolution}, quadratic ($\alpha=2$) dilation \ref{eq:intro_dilation_scale_space_convolution}, and quadratic erosion \ref{eq:intro_erosion_scale_space_convolution} scale-space representations of a grayscale image of the fundus of the eye at various scale parameters. In the Gaussian scale-space both white and black features fade away towards a uniform image. In the dilation scale-space the black details (low values), such as the vessels, vanish at larger scales. In the erosion scale-space the white details (high values), such as the space between vessels, are removed at larger scales.
Figure 3: The architecture of an $N$-layer PDE-CNN with $C$ channels. A PDE sublayer is either of the form \ref{eq:pde_sublayer} or \ref{eq:convection_pde_sublayer}.
Figure 4: One instance of an input and its corresponding target segmentation from the DRIVE dataset, together with two example outputs of networks with different Dice coefficients.
Figure 5: A scatterplot of the accuracy of a 6-layer PDE-CNN on the DRIVE dataset, with various designs of the PDE layer as indicated in the table on the left. The crosses indicate the mean. The rows are organized according to the number of semifields included in the model.
...and 1 more figures
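
The Figure 1 caption notes that the affine transformation block is equivalent to a 2D convolution with bias and $1 \times 1$ kernels. The sketch below illustrates that equivalence in plain NumPy: each pixel's channel vector is mapped by the same matrix and bias. The function name `affine_block`, the array layout (channels, height, width), and the random test data are assumptions made for illustration, not taken from the paper.

```python
import numpy as np

def affine_block(features, weight, bias):
    """Per-pixel affine map across channels (a 1x1 convolution with bias).

    features: array of shape (C_in, H, W)
    weight:   array of shape (C_out, C_in)
    bias:     array of shape (C_out,)
    returns:  array of shape (C_out, H, W)
    """
    # Contract over the input-channel axis; spatial axes are untouched.
    mixed = np.einsum("oc,chw->ohw", weight, features)
    return mixed + bias[:, None, None]

# Illustrative data: 3 input channels, 2 output channels, a 4x5 image.
rng = np.random.default_rng(0)
x = rng.normal(size=(3, 4, 5))
A = rng.normal(size=(2, 3))
b = rng.normal(size=2)

out = affine_block(x, A, b)

# The value at any pixel depends only on that pixel's channel vector.
pixel_check = A @ x[:, 1, 2] + b
```

Because the map acts identically at every pixel and involves no spatial neighbourhood, it commutes with translations and rotations of the image grid; the spatial processing is left entirely to the PDE solvers.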

Theorems & Definitions (61)

Definition 1: Semifield
Definition 2: Semifields of Interest
Definition 3: Semifield Isomorphism
Proposition 1: Some Semifields Isomorphism
Definition 4: Semifield Metric
Definition 5: Employed Semifield Metrics
Definition 6: One-Dimensional Semifield
Definition 7: Semimodule
Definition 8: Semifield Linear
Definition 9: Function Semimodule
...and 51 more
