Hybrid deep additive neural networks

Gyu Min Kim, Jeong Min Jeon

TL;DR

This work introduces deep additive neural networks (DANN) and three hybrid variants (HDANN1–HDANN3) that bring the idea of additive regression into deep learning. The architectures use nonlinear basis expansions for inner nodes together with fixed activation functions. The authors prove universal approximation properties for these architectures and support them with simulations and an application to the California Housing data, where the HDANNs often achieve lower errors than traditional deep neural networks while using far fewer parameters. The approach is presented as a scalable, easy-to-implement alternative to standard deep networks, with potential extensions to convolutional and classification settings.

Abstract

Traditional neural networks (multi-layer perceptrons) have become an important tool in data science due to their success across a wide range of tasks. However, their performance is sometimes unsatisfactory, and they often require a large number of parameters, primarily due to their reliance on the linear combination structure. Meanwhile, additive regression has been a popular alternative to linear regression in statistics. In this work, we introduce novel deep neural networks that incorporate the idea of additive regression. Our neural networks share architectural similarities with Kolmogorov-Arnold networks but are based on simpler yet flexible activation and basis functions. Additionally, we introduce several hybrid neural networks that combine this architecture with that of traditional neural networks. We derive their universal approximation properties and demonstrate their effectiveness through simulation studies and a real-data application. The numerical results indicate that our neural networks generally achieve better performance than traditional neural networks while using fewer parameters.
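
To make the contrast with the linear combination structure concrete, the following is a minimal sketch of a single additive layer, in which each output node applies a fixed activation to a sum of univariate basis expansions of the inputs. It is an illustration under assumptions, not the authors' implementation: the cosine basis `B_r(x) = cos(r*pi*x)`, the `tanh` activation, the sizes `n`, `d`, `q`, `m`, and the function names `cosine_basis` and `additive_layer` are all hypothetical choices made for this example.

```python
import numpy as np

def cosine_basis(x, q):
    """Evaluate B_r(x) = cos(r * pi * x) for r = 1..q.

    The cosine family is an assumption for this sketch, not the paper's basis.
    x: array of shape (n, d) with entries in [0, 1].
    Returns an array of shape (n, d, q).
    """
    r = np.arange(1, q + 1)
    return np.cos(np.pi * x[..., None] * r)

def additive_layer(x, coef, bias, activation=np.tanh):
    """One additive layer (hypothetical helper).

    Each output node applies a fixed activation to a sum of univariate
    basis expansions, one expansion per input coordinate.
    x:    (n, d) inputs scaled to [0, 1]
    coef: (d, q, m) coefficients for d inputs, q basis functions, m output nodes
    bias: (m,) intercepts
    """
    q = coef.shape[1]
    basis = cosine_basis(x, q)                   # shape (n, d, q)
    pre = np.einsum("ndq,dqm->nm", basis, coef)  # sum over inputs and basis functions
    return activation(pre + bias)

rng = np.random.default_rng(0)
n, d, q, m = 5, 3, 4, 2
x = rng.uniform(size=(n, d))
coef = rng.normal(size=(d, q, m))
bias = np.zeros(m)
out = additive_layer(x, coef, bias)
print(out.shape)  # (5, 2)
```

In this sketch the trainable quantities are only `coef` and `bias`; the activation and the basis functions stay fixed. Stacking several such layers, or mixing them with ordinary linear layers, would give deep and hybrid variants in the spirit described above, though the exact constructions are defined in the paper itself.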

Paper Structure

This paper contains 14 sections, 4 theorems, 47 equations, 11 figures, 6 tables.

Key Result

Lemma 1

There exists a set $\{B_r:[0,1]\rightarrow\mathbb{R}\,|\,r\geq1\}$ of known functions such that, for any given continuous function $\phi:[0,1] \rightarrow \mathbb{R}$ and constant $\epsilon>0$, there exist $q\geq1$ and $c_{r}, c \in \mathbb{R}$ such that

Figures (11)

  • Figure 1: Architectures of (\ref{eqn:1-FCN}) (left) and (\ref{eqn:O_ANN}) (right). In the left panel, $W_k$ denotes $(w_{1k},\ldots,w_{dk})^{\top}\in\mathbb{R}^d$.
  • Figure 2: Architecture of ANN model.
  • Figure 3: Architecture of ANN model in the case of the same set and same number of basis functions.
  • Figure 4: Architecture of DANN model.
  • Figure 5: Architecture of HDANN1 model.
  • ...and 6 more figures

Theorems & Definitions (5)

  • Lemma 1
  • Theorem 1
  • Remark 1
  • Theorem 2
  • Theorem 3