APTx Neuron: A Unified Trainable Neuron Architecture Integrating Activation and Computation
Ravin Kumar
TL;DR
The paper addresses inefficiencies in conventional neural networks that stem from separating the linear transformation from a fixed activation function. It introduces the APTx Neuron, a unified, trainable unit that embeds per-input non-linear modulation and a bias term within a single expression, extending the APTx activation function to a full neuron. The authors demonstrate the approach on MNIST with a compact network of approximately 332K trainable parameters that reaches up to 96.69% test accuracy within 11 epochs, which they present as evidence of fast convergence and expressive power. They also outline pathways for integrating APTx Neurons into CNNs and Transformers, suggesting a flexible paradigm for building compact and adaptive deep architectures.
Abstract
We propose the APTx Neuron, a novel, unified neural computation unit that integrates non-linear activation and linear transformation into a single trainable expression. The APTx Neuron is derived from the APTx activation function, thereby eliminating the need for separate activation layers and making the architecture both optimization-efficient and elegant. The proposed neuron follows the functional form $y = \sum_{i=1}^{n} \left( (\alpha_i + \tanh(\beta_i x_i)) \cdot \gamma_i x_i \right) + \delta$, where all parameters $\alpha_i$, $\beta_i$, $\gamma_i$, and $\delta$ are trainable. We validate our APTx Neuron-based architecture on the MNIST dataset, achieving up to $96.69\%$ test accuracy within 11 epochs using approximately 332K trainable parameters. The results highlight the superior expressiveness and training efficiency of the APTx Neuron compared to traditional neurons, pointing toward a new paradigm in unified neuron design and the architectures built upon it. Source code is available at https://github.com/mr-ravin/aptx_neuron.
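To make the functional form in the abstract concrete, the following is a minimal sketch of the forward pass of a single APTx Neuron in plain Python. It is an illustration under stated assumptions, not the authors' implementation: the function names `aptx_neuron` and `aptx_param_count` are hypothetical, the parameters are passed as plain lists rather than trainable tensors, and no training logic is shown.

```python
import math


def aptx_neuron(x, alpha, beta, gamma, delta):
    """Forward pass of one APTx Neuron (illustrative sketch).

    Computes y = sum_i (alpha_i + tanh(beta_i * x_i)) * gamma_i * x_i + delta,
    following the functional form stated in the abstract.
    x, alpha, beta, gamma are equal-length sequences of floats; delta is a float.
    """
    if not (len(x) == len(alpha) == len(beta) == len(gamma)):
        raise ValueError("x, alpha, beta and gamma must have the same length")
    total = 0.0
    for x_i, a_i, b_i, g_i in zip(x, alpha, beta, gamma):
        # Per-input non-linear modulation combined with a per-input scale.
        total += (a_i + math.tanh(b_i * x_i)) * g_i * x_i
    # Single shared bias term.
    return total + delta


def aptx_param_count(n_inputs):
    """Parameters of one neuron under this form: alpha, beta, gamma per input, plus delta."""
    return 3 * n_inputs + 1


if __name__ == "__main__":
    # Hypothetical parameter values chosen only for illustration.
    y = aptx_neuron([0.5, -1.0], [1.0, 1.0], [1.0, 1.0], [0.5, 0.5], 0.1)
    print(y)
    print(aptx_param_count(2))
```

Under this form, a neuron with $n$ inputs carries $3n + 1$ trainable parameters, compared with $n + 1$ for a conventional neuron with a weight per input and a bias. When every input is zero, the output reduces to the bias $\delta$.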
