CauchyNet: Compact and Data-Efficient Learning using Holomorphic Activation Functions

Hong-Kun Zhang, Xin Li, Sikun Yang, Zhihong Xia

TL;DR

CauchyNet tackles data-scarce learning and edge deployment by embedding real inputs into the complex plane and employing a holomorphic, inversion-based activation inspired by Cauchy’s integral formula. The model uses complex-valued parameters, a single hidden layer, and a real-valued output with an auxiliary imaginary penalty to stabilize learning, achieving a compact parameter footprint and robust performance on near-singular and missing-data tasks. Theoretical guarantees come from a Cauchy kernel-based universal approximation framework, while empirical results demonstrate strong accuracy and efficiency across 1D and 2D function approximation, missing-data imputation, and constrained forecasting, often outperforming larger baselines with far fewer parameters. This holomorphic inductive bias enables stable gradient flow, efficient backpropagation via Wirtinger calculus, and practical applicability to resource-constrained predictive modeling and scientific computing.
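
The ingredients named above (complex embedding, inversion-based activation, a single hidden layer, and a real-valued output with an imaginary penalty) can be sketched in a few lines of NumPy. This is only an illustrative sketch, not the authors' implementation: the function names, the parameter shapes, the plain `1/z` form of the activation, and the penalty weight `lam` are all assumptions made for the example, and no guard against `z = 0` is included.

```python
import numpy as np

def cauchy_activation(z):
    # Assumed form of the inversion-based activation: z -> 1/z,
    # which is holomorphic away from z = 0 (no safeguard at zero here).
    return 1.0 / z

def cauchynet_forward(x, W, b, v):
    """Hypothetical single-hidden-layer forward pass.

    x: (n, d) real inputs
    W: (d, m) complex weights, b: (m,) complex biases
    v: (m,) complex output weights
    Returns a complex vector of shape (n,).
    """
    z = x.astype(np.complex128) @ W + b  # embed real inputs in the complex plane
    h = cauchy_activation(z)             # hidden layer with inversion activation
    return h @ v                         # complex-valued pre-output

def penalized_loss(y_pred, y_true, lam=0.1):
    # The real part is used as the prediction; the imaginary part is
    # pushed toward zero by an auxiliary penalty weighted by lam (assumed).
    real_err = np.mean((y_pred.real - y_true) ** 2)
    imag_pen = np.mean(y_pred.imag ** 2)
    return real_err + lam * imag_pen
```

In this sketch the model has `m * (d + 2)` complex parameters, which shows how a single hidden layer keeps the parameter count small. Training would additionally require complex-valued gradients, for example through a framework with Wirtinger-calculus autodiff support.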

Abstract

A novel neural network inspired by Cauchy's integral formula, named CauchyNet, is proposed for function approximation tasks such as time series forecasting and missing-data imputation. By embedding real-valued data into the complex plane, CauchyNet efficiently captures complex temporal dependencies, surpassing traditional real-valued models in both predictive performance and computational efficiency. Grounded in Cauchy's integral formula and supported by the universal approximation theorem, CauchyNet offers strong theoretical guarantees for function approximation. The architecture incorporates complex-valued activation functions, enabling robust learning from incomplete data while maintaining a compact parameter footprint and reducing computational overhead. Through extensive experiments in diverse domains, including transportation, energy consumption, and epidemiological data, CauchyNet consistently outperforms state-of-the-art models in predictive accuracy, often achieving a 50% lower mean absolute error with fewer parameters. These findings highlight CauchyNet's potential as an effective and efficient tool for data-driven predictive modeling, particularly in resource-constrained and data-scarce environments.

Paper Structure

This paper contains 22 sections, 3 theorems, 49 equations, 14 figures, 3 tables, 2 algorithms.

Key Result

Theorem 1

(Cauchy Approximation Theorem). Let $f\in{\mathcal{F}}_{M}$ be continuous on $M\subset\mathbb{R}^{N}$ with $M$ contained in an open domain $U\subset\mathbb{C}^{N}$ whose boundary is $\partial \bar{U}$. For any $\epsilon>0$, there exists $m$, points $\{\boldsymbol{\xi}_{1},...,\boldsymbol{\xi}_{m}\}\s

Figures (14)

  • Figure 1: (Left) Comparison of the true function (dashed black line), CauchyNet (orange), and ReLU MLP (blue) in approximating a rational spike. The plot highlights CauchyNet's ability to closely track steep gradients, while the ReLU MLP significantly underestimates the peak near $x = 0.5$. (Right) Training and validation loss trajectories (log scale) over 500 epochs. Shaded regions indicate standard deviation across 10 independent runs, showing that CauchyNet converges faster and achieves lower final loss compared to the ReLU MLP.
  • Figure 2: Visual representation of the Cauchy activation function $\mathscr{X}(\boldsymbol{z})$ for $z$ values outside a small circular disc. (Top row) 3D surface plots showing the real part, imaginary part, and magnitude of $\mathscr{X}(\boldsymbol{z})$. (Bottom row) Corresponding 2D contour plots for each component. The white region in the middle represents the excluded circular disc, demonstrating how the activation function transforms inputs in the complex plane.
  • Figure 3: Roadmap of the theoretical analysis for CauchyNet
  • Figure 4: Behavior of the one-dimensional complex Cauchy kernel $K(\boldsymbol{\xi}, x) = \frac{1}{\boldsymbol{\xi} - x}$, with $\boldsymbol{\xi}$ values distributed along an ellipse in the complex plane. The main plots show the real (top) and imaginary (bottom) parts of $K(\boldsymbol{\xi}, x)$ as a function of $x\in[-5, 5]$. The inset illustrates the positions of $\boldsymbol{\xi}$ values on the ellipse, with semi-major and semi-minor axes of 6 and 2 units, respectively. This visualization highlights the kernel's ability to encode variations in $x$ while preserving holomorphic properties.
  • Figure 5: (a) Training and validation loss (log scale) over 200 epochs. CauchyNet converges faster and retains lower validation loss. (b) Box plot of absolute errors on the test set for various models. CauchyNet shows the smallest median error and least variability, excelling under data scarcity (MAE of 1.5 vs. 3.2 or much higher for baselines). (c) Box plot of absolute errors on the 1D test set for various models. (d) A table listing error statistics for CauchyNet and the baseline models. CauchyNet achieves the smallest median error and the tightest error distribution, significantly outperforming baselines like SIREN, N-BEATS, and FNN in imputing missing values. (e) Training (blue) and test (red) points for the one-dimensional gap-filling task, with missing zones centered around turning points. (f) Predictions from CauchyNet (orange) closely match the true function (dashed line) within the missing regions, demonstrating accurate reconstruction.
  • ...and 9 more figures
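
The setting of Figure 4 can be reproduced numerically. The sketch below assumes the standard Cauchy kernel form $1/(\xi - x)$, an ellipse with semi-axes 6 and 2 centered at the origin, and $x$ ranging over $[-5, 5]$. It also shows the classical idea behind approximating a function by a finite sum of Cauchy kernels: discretizing Cauchy's integral formula on the ellipse. The function names and the test function $z^2 + 1$ are illustrative assumptions, and this is not claimed to be the construction used in the paper's proofs.

```python
import numpy as np

def ellipse(m, a=6.0, b=2.0):
    # m equally spaced parameter values on an origin-centered ellipse
    # (semi-axes a, b), traversed counterclockwise.
    theta = 2.0 * np.pi * np.arange(m) / m
    xi = a * np.cos(theta) + 1j * b * np.sin(theta)
    # Contour increment: d(xi)/d(theta) times the step 2*pi/m.
    dxi = (-a * np.sin(theta) + 1j * b * np.cos(theta)) * (2.0 * np.pi / m)
    return xi, dxi

def cauchy_kernel(xi, x):
    # K[k, j] = 1 / (xi_k - x_j); assumed kernel form.
    return 1.0 / (xi[:, None] - x[None, :])

def cauchy_sum(f, x, m=200):
    # Discretized Cauchy integral formula:
    # f(x) ~ (1 / (2*pi*i)) * sum_k f(xi_k) * dxi_k / (xi_k - x)
    xi, dxi = ellipse(m)
    K = cauchy_kernel(xi, x)
    return (f(xi) * dxi) @ K / (2j * np.pi)

x = np.linspace(-5.0, 5.0, 11)
approx = cauchy_sum(lambda z: z**2 + 1.0, x)
```

For a function that is holomorphic inside the ellipse, `approx.real` should closely track the function values on the real interval, while `approx.imag` stays near zero. A trainable network would instead learn the points and coefficients from data.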

Theorems & Definitions (5)

  • Definition 1
  • Definition 2
  • Theorem 1
  • Theorem 2
  • Theorem 3