APTx Neuron: A Unified Trainable Neuron Architecture Integrating Activation and Computation

Ravin Kumar

TL;DR

The paper addresses inefficiencies in conventional neural networks that stem from separating the linear transformation from a fixed activation function. It introduces the APTx Neuron, a unified, trainable unit that embeds per-input non-linear modulation and a bias term within a single expression, extending the APTx activation to a full neuron. The authors demonstrate the approach on MNIST with a compact network of approximately 332K trainable parameters that reaches up to 96.69% test accuracy within 11 epochs, which they present as evidence of fast convergence and expressive power. They also outline pathways for integrating APTx Neurons into CNNs and Transformers, suggesting a flexible paradigm for building compact and adaptive deep architectures.

Abstract

We propose the APTx Neuron, a novel, unified neural computation unit that integrates non-linear activation and linear transformation into a single trainable expression. The APTx Neuron is derived from the APTx activation function, thereby eliminating the need for separate activation layers and making the architecture both optimization-efficient and elegant. The proposed neuron follows the functional form $y = \sum_{i=1}^{n} \left( (\alpha_i + \tanh(\beta_i x_i)) \cdot \gamma_i x_i \right) + \delta$, where all parameters $\alpha_i$, $\beta_i$, $\gamma_i$, and $\delta$ are trainable. We validate our APTx Neuron-based architecture on the MNIST dataset, achieving up to $96.69\%$ test accuracy within 11 epochs using approximately 332K trainable parameters. The results highlight the superior expressiveness and training efficiency of the APTx Neuron compared to traditional neurons, pointing toward a new paradigm in unified neuron design and the architectures built upon it. Source code is available at https://github.com/mr-ravin/aptx_neuron.
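
To make the functional form in the abstract concrete, the following is a minimal sketch of the forward pass of a single APTx Neuron in plain Python. It is not taken from the authors' repository: the function names `aptx_neuron` and `aptx_param_count` and the example values are hypothetical, chosen only for illustration, and no training logic is shown.

```python
import math


def aptx_neuron(x, alpha, beta, gamma, delta):
    """Forward pass of one neuron, following the formula in the abstract:

        y = sum_i (alpha_i + tanh(beta_i * x_i)) * gamma_i * x_i + delta

    x, alpha, beta, gamma are equal-length sequences of floats;
    delta is a single float (the bias term).
    """
    if not (len(x) == len(alpha) == len(beta) == len(gamma)):
        raise ValueError("x, alpha, beta, gamma must have the same length")
    total = 0.0
    for x_i, a_i, b_i, g_i in zip(x, alpha, beta, gamma):
        # Each input gets its own non-linear modulation before summation.
        total += (a_i + math.tanh(b_i * x_i)) * g_i * x_i
    return total + delta


def aptx_param_count(n_inputs):
    """Parameters of one neuron with n inputs: n alphas, n betas, n gammas, one delta."""
    return 3 * n_inputs + 1


# Illustrative values only (assumed, not from the paper).
x = [1.0, 2.0]
y = aptx_neuron(x, alpha=[1.0, 1.0], beta=[0.0, 0.0], gamma=[0.5, -1.0], delta=0.25)
```

With every $\beta_i$ set to zero, as in the example values, $\tanh(\beta_i x_i) = 0$ and the expression reduces to an ordinary linear neuron with weights $\alpha_i \gamma_i$ and bias $\delta$. The non-linearity therefore enters only through the trainable $\beta_i$ terms. The formula also implies $3n + 1$ trainable parameters for a neuron with $n$ inputs.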

Paper Structure

This paper contains 20 sections, 12 equations, 2 figures, 1 table.

Figures (2)

  • Figure 1: Visual analysis of train and test loss values.
  • Figure 2: Visual analysis of train and test accuracy values.