Learning Expressive Random Feature Models via Parametrized Activations
Zailin Ma, Jiansheng Yang, Yaodong Yang
TL;DR
This work introduces the Random Feature Model with Learnable Activation Functions (RFLAF), which learns activations by expressing them as weighted sums of basis functions within the random feature framework to overcome fixed-activation rigidity. The authors provide theory for the single-RBF case and extend to multiple RBFs, including a Gaussian Universal Approximation Theorem and finite-width approximation bounds that quantify expressivity gains. They prove generalization and sample-complexity results showing how the number of random features $M$, grid size $N$, and sample size $n$ scale to achieve a desired accuracy, with practical guidance such as $N=\tilde{\Theta}(1/\epsilon)$ and $M=\tilde{\Theta}(1/\epsilon^2)$. Empirically, RFLAF variants with RBFs or splines outperform traditional RF models across several datasets, with RBFs offering notable computational efficiency, and unfreezing first-layer parameters (LAN) validating the expressivity advantages on two-layer networks. Overall, the paper deepens understanding of learnable activation modules within neural architectures and demonstrates tangible performance gains on kernel-approximation-based models.
Abstract
The random feature (RF) method is a powerful kernel approximation technique, but it is typically equipped with fixed activation functions, which limits its adaptability across diverse tasks. To overcome this limitation, we introduce the Random Feature Model with Learnable Activation Functions (RFLAF), a novel statistical model that parameterizes activation functions as weighted sums of basis functions within the random feature framework. Examples of basis functions include radial basis functions (RBFs), spline functions, and polynomials. For theoretical results, we consider RBFs as representative basis functions. We start with a single RBF as the activation and then extend the results to multiple RBFs, demonstrating that RF models with a learnable activation component substantially expand the represented function space. We provide estimates of the number of samples and random features required to achieve low excess risk. For experiments, we test RFLAF with three types of bases: radial basis functions, spline functions, and polynomials. Experimental results show that RFLAFs with RBFs and splines consistently outperform other RF models, with RBFs running 3 times faster than splines. We then unfreeze the first-layer parameters and retrain the models, validating the expressivity advantage of learnable activation components on regular two-layer neural networks. Our work provides a deeper understanding of learnable activation functions as a component of modern neural network architectures.
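To make the idea of a learnable activation concrete, the following is a minimal NumPy sketch of a random feature model whose activation is a weighted sum of Gaussian RBFs. It is an illustration under stated assumptions, not the authors' implementation: the function names `rbf_basis` and `rflaf_forward`, the Gaussian form of the basis, the uniform grid of centers, the shared width, the $1/\sqrt{d}$ scaling of the random weights, and the $1/M$ averaging are all choices made here for the example.

```python
import numpy as np

def rbf_basis(z, centers, width):
    """Evaluate N Gaussian RBFs at every pre-activation.

    Returns an array of shape z.shape + (N,). The Gaussian form and the
    shared width are assumptions made for this sketch.
    """
    return np.exp(-((z[..., None] - centers) ** 2) / (2.0 * width ** 2))

def rflaf_forward(X, W, v, a, centers, width):
    """Random feature model with a learnable activation (illustrative form).

    X: (n, d) inputs
    W: (M, d) random first-layer weights, kept frozen
    v: (M,)   trainable output weights
    a: (N,)   trainable activation coefficients, one per basis function
    """
    Z = X @ W.T                               # (n, M) pre-activations
    act = rbf_basis(Z, centers, width) @ a    # (n, M): sigma(z) = sum_i a_i * B_i(z)
    return act @ v / W.shape[0]               # (n,): average over the M random features

# Hypothetical sizes, chosen only for the example.
rng = np.random.default_rng(0)
n, d, M, N = 5, 3, 64, 9
X = rng.normal(size=(n, d))
W = rng.normal(size=(M, d)) / np.sqrt(d)      # frozen random features
v = rng.normal(size=M)
a = rng.normal(size=N)
centers = np.linspace(-2.0, 2.0, N)           # grid of N RBF centers
width = 0.5

y = rflaf_forward(X, W, v, a, centers, width)
```

In this sketch, setting `a` to a one-hot vector recovers a model with a single fixed RBF activation, which corresponds to the starting case described in the abstract; letting `a` vary is what enlarges the set of functions the model can represent. The output is linear in `a` for fixed `v` and linear in `v` for fixed `a`.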
