Addressing Spectral Bias of Deep Neural Networks by Multi-Grade Deep Learning

Ronglong Fang, Yuesheng Xu

TL;DR

This paper proposes to learn a function containing high-frequency components by composing several shallow neural networks (SNNs), each of which learns certain low-frequency information from the given data, and shows that multi-grade deep learning (MGDL) excels at representing functions containing high-frequency information.

Abstract

Deep neural networks (DNNs) suffer from spectral bias: they tend to prioritize learning the lower-frequency components of a function and struggle to capture its high-frequency features. This paper addresses this issue. Note that a function having only low-frequency components may be well represented by a shallow neural network (SNN), a network having only a few layers. Observing that a composition of low-frequency functions can effectively approximate a high-frequency function, we propose to learn a function containing high-frequency components by composing several SNNs, each of which learns certain low-frequency information from the given data. We implement the proposed idea by exploiting the multi-grade deep learning (MGDL) model, a recently introduced model that trains a DNN incrementally, grade by grade: the current grade learns, from the residue of the previous grade, only an SNN composed with the SNNs trained in the preceding grades as features. We apply MGDL to synthetic, manifold, colored image, and MNIST datasets, all characterized by the presence of high-frequency features. Our study reveals that MGDL excels at representing functions containing high-frequency information. Specifically, the neural networks learned in each grade adeptly capture some low-frequency information, allowing their compositions with the SNNs learned in the previous grades to effectively represent the high-frequency features. Our experimental results underscore the efficacy of MGDL in addressing the spectral bias inherent in DNNs. By leveraging MGDL, we offer insights into overcoming the spectral bias limitation of DNNs, thereby enhancing the performance and applicability of deep learning models in tasks requiring the representation of high-frequency information. This study confirms that the proposed method offers a promising solution to address the spectral bias of DNNs.
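The grade-by-grade scheme described in the abstract can be illustrated with a small NumPy sketch. It is not the authors' implementation: to stay short and deterministic, each grade here uses a random, untrained tanh hidden layer and fits only its output weights by least squares, whereas the paper trains each SNN. The names `fit_grade` and `multigrade_fit`, the widths, and the target function are all hypothetical choices made for illustration. What the sketch keeps from the description is the structure: each grade takes the frozen features of the previous grades as input and fits the residue left by them.

```python
import numpy as np

def fit_grade(features, residual, width, rng):
    """One grade (simplified assumption): a random tanh hidden layer stacked
    on the frozen features, with output weights fitted by least squares."""
    W = rng.normal(scale=2.0, size=(features.shape[1], width))
    b = rng.normal(scale=1.0, size=width)
    hidden = np.tanh(features @ W + b)            # features passed to the next grade
    A = np.hstack([hidden, np.ones((hidden.shape[0], 1))])  # append a bias column
    coef, *_ = np.linalg.lstsq(A, residual, rcond=None)
    return hidden, A @ coef

def multigrade_fit(x, y, n_grades=4, width=32, seed=0):
    """Fit y grade by grade; each grade learns the residue of the previous ones."""
    rng = np.random.default_rng(seed)
    features = x.reshape(-1, 1)
    residual = y.copy()
    approx = np.zeros_like(y)
    norms = [np.linalg.norm(residual)]
    for _ in range(n_grades):
        features, out = fit_grade(features, residual, width, rng)
        approx += out                  # the final model is the sum of grade outputs
        residual = residual - out      # the next grade sees only what is left
        norms.append(np.linalg.norm(residual))
    return approx, norms

# Hypothetical target mixing a low and a high frequency.
x = np.linspace(0.0, 1.0, 512)
y = np.sin(2 * np.pi * x) + 0.5 * np.sin(2 * np.pi * 20 * x)
approx, norms = multigrade_fit(x, y)
print([round(float(v), 4) for v in norms])   # residual norm after each grade
```

Because the zero output is always an admissible least-squares solution, the residual norm in this sketch cannot increase from one grade to the next. How quickly it decreases depends on the random features and says nothing about the results reported in the paper.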

Paper Structure

This paper contains 10 sections, 1 theorem, 23 equations, 14 figures, 4 tables.

Key Result

Theorem 1

Let $\mathbb{D}$ be a compact subset of $\mathbb{R}^s$ and $L_2(\mathbb{D}, \mathbb{R}^t)$ denote the space of $t$-dimensional vector-valued square-integrable functions on $\mathbb{D}$. If $\mathbf{f} \in L_2(\mathbb{D}, \mathbb{R}^t)$, then for all $i=1, 2, \ldots$, where $\mathcal{N}^*_{l}$ is the SNN learned in grade $l$ of MGDL, and for $i=1,2, \ldots$, either $\mathbf{f}_{i+1} = \mathbf{0}$ or

Figures (14)

  • Figure 1: Amplitude versus one-side frequency plot for the functions learned across four grades of MGDL for settings 1-4.
  • Figure 2: Comparison of SGDL (left) and MGDL (right): training and validation loss across settings 1-4.
  • Figure 3: Amplitude vs. one-side frequency plot for the learned function across four grades of MGDL: settings 1 (left) and 2 (right) with $q=0$.
  • Figure 4: Comparison of the training and validation loss for SGDL (1st and 3rd subfigures) and MGDL (2nd and 4th subfigures) in settings 1 (1st and 2nd subfigures) and 2 (3rd and 4th subfigures).
  • Figure 5: Comparison of PSNR values for SGDL and MGDL on images Cat, Sea, and Building: SGDL (a)-(c), MGDL (d)-(f).
  • ...and 9 more figures
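
Several captions above refer to amplitude versus one-side frequency plots. As a rough illustration of what such a plot measures, and of how composing smooth maps can raise frequency, the sketch below computes a one-sided amplitude spectrum with NumPy's real FFT and applies it to repeated compositions of the quadratic map $T_2(u) = 2u^2 - 1$, using the identity $T_2(\cos\theta) = \cos 2\theta$. This example is an assumption chosen for clarity, not the construction or the plotting code used in the paper, and `one_sided_amplitude` is a hypothetical helper name.

```python
import numpy as np

def one_sided_amplitude(signal, sample_rate):
    """Return (frequencies, amplitudes) of the one-sided spectrum of a real signal."""
    n = signal.size
    amplitude = np.abs(np.fft.rfft(signal)) / n
    amplitude[1:] *= 2.0              # fold negative frequencies onto positive ones
    if n % 2 == 0:
        amplitude[-1] /= 2.0          # the Nyquist bin has no negative twin
    freqs = np.fft.rfftfreq(n, d=1.0 / sample_rate)
    return freqs, amplitude

def t2(u):
    """Quadratic map with T2(cos(theta)) = cos(2 * theta)."""
    return 2.0 * u**2 - 1.0

n = 1024
x = np.arange(n) / n                   # one period, endpoint excluded
base = np.cos(2 * np.pi * x)           # frequency 1
signals = [base, t2(base), t2(t2(base))]

peaks = []
for s in signals:
    freqs, amp = one_sided_amplitude(s, sample_rate=n)
    peaks.append(float(freqs[np.argmax(amp)]))
print(peaks)   # dominant frequencies: 1, 2 and 4 cycles per unit length
```

Each composition with the same simple map doubles the dominant frequency, which is the kind of effect the composition argument in the abstract relies on.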

Theorems & Definitions (1)

  • Theorem 1