A unified Fourier slice method to derive ridgelet transform for a variety of depth-2 neural networks

Sho Sonoda, Isao Ishikawa, Masahiro Ikeda

TL;DR

This work addresses the challenge of interpreting neural network parameters by viewing them through ridgelet transforms and parameter distributions. It proposes a Fourier slice method to systematically derive ridgelet transforms for diverse architectures, including finite-field networks, group-convolutional nets on Hilbert spaces, fully-connected nets on noncompact symmetric spaces, and pooling/d-plane setups, accompanied by constructive reconstruction formulas. The core contributions are a three-step Fourier-based procedure to obtain ridgelet coefficients, universal-reconstruction guarantees across cases, and connections to geometric deep learning and Radon/wavelet theory. Together, these developments offer a harmonic-analysis perspective on neural networks with potential implications for parameter-distribution analysis and representer theorems.

Abstract

To investigate neural network parameters, it is easier to study the distribution of parameters than to study the parameters in each neuron. The ridgelet transform is a pseudo-inverse operator that maps a given function $f$ to the parameter distribution $γ$ so that a network $\mathtt{NN}[γ]$ reproduces $f$, i.e. $\mathtt{NN}[γ]=f$. For depth-2 fully-connected networks on a Euclidean space, the ridgelet transform has been discovered up to the closed-form expression, thus we could describe how the parameters are distributed. However, for a variety of modern neural network architectures, the closed-form expression has not been known. In this paper, we explain a systematic method using Fourier expressions to derive ridgelet transforms for a variety of modern networks such as networks on finite fields $\mathbb{F}_p$, group convolutional networks on abstract Hilbert space $\mathcal{H}$, fully-connected networks on noncompact symmetric spaces $G/K$, and pooling layers, or the $d$-plane ridgelet transform.
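The pseudo-inverse property $\mathtt{NN}[γ]=f$ can be checked numerically in the finite-field case, where every integral becomes a finite sum. The sketch below is a minimal illustration, not the paper's construction: the network $\sum_{a,b} γ(a,b)\,σ(a\cdot x-b)$, the ridgelet transform $\sum_x f(x)\,\overline{ρ(a\cdot x-b)}$, the auxiliary function `rho`, the zero-mean condition on `rho`, and the sizes `p` and `m` are all assumptions chosen for the example. Under these assumptions a short discrete Fourier calculation predicts that the network reproduces $f$ up to the scalar $p^{m-1}\sum_{ω\neq 0} σ^\sharp(ω)\overline{ρ^\sharp(ω)}$.

```python
import itertools
import numpy as np

# Illustrative sizes (assumptions): a small prime p and input dimension m.
p, m = 5, 2
rng = np.random.default_rng(0)

# Enumerate F_p^m once; the same grid serves as inputs x and weights a.
points = np.array(list(itertools.product(range(p), repeat=m)))  # (p**m, m)

f = rng.standard_normal(p**m)    # target function f on F_p^m
sigma = rng.standard_normal(p)   # activation sigma on F_p
rho = rng.standard_normal(p)     # auxiliary function rho on F_p
rho -= rho.mean()                # assumption: zero mean, i.e. rho_hat(0) = 0

# arg[a, x, b] = (a . x - b) mod p
dots = (points @ points.T) % p
arg = (dots[:, :, None] - np.arange(p)[None, None, :]) % p

# Hypothetical ridgelet transform: gamma(a, b) = sum_x f(x) * conj(rho(a.x - b))
gamma = np.einsum("x,axb->ab", f, np.conj(rho[arg]))

# Hypothetical network: NN[gamma](x) = sum_{a,b} gamma(a, b) * sigma(a.x - b)
recon = np.einsum("ab,axb->x", gamma, sigma[arg])

# Predicted factor: p^(m-1) * sum_{omega != 0} sigma_hat(omega) * conj(rho_hat(omega))
sigma_hat, rho_hat = np.fft.fft(sigma), np.fft.fft(rho)
factor = p ** (m - 1) * np.sum(sigma_hat[1:] * np.conj(rho_hat[1:]))

# Compare the network output with the rescaled target.
print(np.allclose(recon, factor.real * f))
```

In this sketch, centering `rho` removes the $ω=0$ frequency. Without it, the same calculation leaves an extra term proportional to $\sum_y f(y)$, so exact reconstruction would need another condition, such as a zero-mean $f$ or $σ$.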

Paper Structure

This paper contains 42 sections, 19 theorems, 133 equations, 2 figures, and 1 table.

Key Result

Theorem 1.1

Suppose $\sigma$ and $\rho$ are a tempered distribution ($\mathcal{S}'$) on $\mathbb{R}$ and a rapidly decreasing function ($\mathcal{S}$) on $\mathbb{R}$, respectively. Then, for any square integrable function $f$, the following reconstruction formula holds with the factor being a scalar product of $\sigma$ and $\rho$, where $\sharp$ denotes the Fourier transform.
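The displayed formula itself is not reproduced in this summary. For orientation, in the Euclidean ridgelet literature the statement usually takes the following shape. The symbols $S$ (integral representation of the network), $R$ (ridgelet transform), $m$ (input dimension), and $(\!(\sigma,\rho)\!)$ are notation assumed here rather than taken from the text above, and the constant depends on the Fourier normalization $\sigma^\sharp(\omega)=\int_{\mathbb{R}}\sigma(b)e^{-ib\omega}\,\mathrm{d}b$.

$$
\begin{aligned}
S[\gamma](x) &:= \int_{\mathbb{R}^m\times\mathbb{R}} \gamma(a,b)\,\sigma(a\cdot x-b)\,\mathrm{d}a\,\mathrm{d}b,\\
R[f;\rho](a,b) &:= \int_{\mathbb{R}^m} f(x)\,\overline{\rho(a\cdot x-b)}\,\mathrm{d}x,\\
S[R[f;\rho]] &= (\!(\sigma,\rho)\!)\,f,\qquad
(\!(\sigma,\rho)\!) := (2\pi)^{m-1}\int_{\mathbb{R}} \sigma^\sharp(\omega)\,\overline{\rho^\sharp(\omega)}\,|\omega|^{-m}\,\mathrm{d}\omega.
\end{aligned}
$$

Under the same assumptions, the three Fourier steps behind it can be sketched as follows: (1) take the Fourier transform in $b$, (2) substitute $a=\xi/\omega$ so that a Fourier inversion in $x$ appears, and (3) choose $\gamma$ in separation-of-variables form.

$$
\begin{aligned}
S[\gamma](x) &= \frac{1}{2\pi}\int_{\mathbb{R}^m\times\mathbb{R}} \gamma^\sharp(a,\omega)\,\sigma^\sharp(\omega)\,e^{i\omega a\cdot x}\,\mathrm{d}a\,\mathrm{d}\omega\\
&= \frac{1}{2\pi}\int_{\mathbb{R}}\left[\int_{\mathbb{R}^m} \gamma^\sharp(\xi/\omega,\omega)\,e^{i\xi\cdot x}\,\mathrm{d}\xi\right]\sigma^\sharp(\omega)\,|\omega|^{-m}\,\mathrm{d}\omega,\\
\gamma^\sharp(\xi/\omega,\omega) &= \widehat{f}(\xi)\,\overline{\rho^\sharp(\omega)}
\;\Longrightarrow\;
S[\gamma](x) = (2\pi)^{m-1}\left[\int_{\mathbb{R}}\sigma^\sharp(\omega)\,\overline{\rho^\sharp(\omega)}\,|\omega|^{-m}\,\mathrm{d}\omega\right] f(x).
\end{aligned}
$$

The choice in step (3) is consistent with $\gamma=R[f;\rho]$, since a direct computation gives $R[f;\rho]^\sharp(a,\omega)=\widehat{f}(\omega a)\,\overline{\rho^\sharp(\omega)}$.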

Figures (2)

  • Figure 1: The Poincaré disk $\mathbb{B}^2$ is a noncompact symmetric space $SU(1,1)/SO(2)$. Shown are the Poincaré disk $\mathbb{B}^2$, its boundary $\partial\mathbb{B}^2$, a point ${\bm{x}}$ (magenta), the horocycle $\xi({\bm{y}},{\bm{u}})$ (magenta) through point ${\bm{y}}$ tangent to the boundary at ${\bm{u}}$, and two geodesics (solid black) orthogonal to the boundary at ${\bm{u}}$ through ${\bm{o}}$ and ${\bm{x}}$ respectively. The signed composite distance $\langle{\bm{y}},{\bm{u}}\rangle$ from the origin ${\bm{o}}$ to the horocycle $\xi({\bm{y}},{\bm{u}})$ can be visualized as the Riemannian distance from ${\bm{o}}$ to point ${\bm{y}}_0$. Similarly, the distance between point ${\bm{x}}$ and horocycle $\xi({\bm{y}},{\bm{u}})$ is understood as the Riemannian distance between ${\bm{x}}$ and ${\bm{y}}_x$ along the geodesic, or equivalently, between ${\bm{x}}_0$ and ${\bm{y}}_0$. See the appendix on the Poincaré disk for more details.
  • Figure 2: The Euclidean fully-connected layer $\sigma({\bm{a}}\cdot{\bm{x}}-b)$ is recast as the signed distance $d({\bm{x}},\xi)$ from a point ${\bm{x}}$ to a hyperplane $\xi({\bm{y}},{\bm{u}})$ followed by nonlinearity $\sigma(r\bullet)$, where ${\bm{y}}$ satisfies $r{\bm{y}}\cdot{\bm{u}}=b$ and $\xi({\bm{y}},{\bm{u}})$ passes through the point ${\bm{y}}$ with normal ${\bm{u}}$.

Theorems & Definitions (50)

  • Definition 1.1
  • Definition 1.2
  • Theorem 1.1: Reconstruction Formula
  • Definition 3.1: Fourier Transform on $\mathbb{Z}_n^m$
  • Theorem 3.1: Inversion Formula
  • Definition 3.2: NN on $\mathbb{F}_p^m$
  • Theorem 3.2: Reconstruction Formula
  • proof
  • Definition 4.1: Fourier Transform on a Hilbert Space $\mathcal{H}_m \subset \mathcal{H}$
  • Theorem 4.1
  • ...and 40 more