Kolmogorov-Arnold Network for Online Reinforcement Learning
Victor Augusto Kich, Jair Augusto Bottega, Raul Steinmetz, Ricardo Bedin Grando, Ayano Yorozu, Akihisa Ohya
TL;DR
This paper applies Kolmogorov-Arnold Networks (KANs), universal function approximators grounded in the Kolmogorov-Arnold representation theorem, to online reinforcement learning as a memory-efficient alternative to MLPs. The authors use KANs as both the policy and the value function approximator within Proximal Policy Optimization (PPO) and benchmark against MLP-based PPO on the DeepMind Control Proprio Robotics benchmark, finding comparable performance with substantially fewer parameters. Key contributions are the first application of KANs to online RL, a systematic comparison with MLP-based PPO across six continuous-control tasks, and an analysis of the trade-off between parameter efficiency and speed. The work suggests KANs are a promising option for resource-constrained RL in robotics; code is available at the project repository.
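For reference, the Kolmogorov-Arnold representation theorem named above can be stated as follows. This is the standard textbook form, not notation taken from the paper: every continuous function f on the n-dimensional unit cube can be written using only continuous univariate functions and addition.

```latex
f(x_1, \dots, x_n) \;=\; \sum_{q=0}^{2n} \Phi_q\!\left( \sum_{p=1}^{n} \phi_{q,p}(x_p) \right),
\qquad f : [0,1]^n \to \mathbb{R} \ \text{continuous},
```

Here the inner functions \phi_{q,p} : [0,1] \to \mathbb{R} and the outer functions \Phi_q : \mathbb{R} \to \mathbb{R} are continuous and univariate. KANs are motivated by this structure: they place learnable univariate functions on the edges of the network and sum them at the nodes.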
Abstract
Kolmogorov-Arnold Networks (KANs) have shown potential as an alternative to Multi-Layer Perceptrons (MLPs) in neural networks, providing universal function approximation with fewer parameters and reduced memory usage. In this paper, we explore the use of KANs as function approximators within the Proximal Policy Optimization (PPO) algorithm. We evaluate this approach by comparing its performance to the original MLP-based PPO using the DeepMind Control Proprio Robotics benchmark. Our results indicate that the KAN-based reinforcement learning algorithm can achieve comparable performance to its MLP-based counterpart, often with fewer parameters. These findings suggest that KANs may offer a more efficient option for reinforcement learning models.
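To make the parameter argument concrete, here is a minimal sketch of a KAN-style layer alongside a parameter count for a plain MLP. It is an illustration under stated assumptions, not the authors' implementation: the class `RBFKANLayer` and the helper `mlp_param_count` are hypothetical names, the edge functions use fixed Gaussian bumps rather than the B-splines common in the KAN literature, and the dimensions (24 observations, 6 actions, 8 basis functions, two hidden layers of 64 units) are invented for the example.

```python
import numpy as np


class RBFKANLayer:
    """KAN-style layer: each input-output edge has its own learnable
    univariate function, here a weighted sum of fixed Gaussian bumps.
    Assumption: Gaussian basis used for brevity instead of B-splines."""

    def __init__(self, in_dim, out_dim, num_basis=8, x_min=-1.0, x_max=1.0, seed=0):
        rng = np.random.default_rng(seed)
        # Basis centers on a shared, fixed grid over the expected input range.
        self.centers = np.linspace(x_min, x_max, num_basis)
        self.width = (x_max - x_min) / (num_basis - 1)
        # One coefficient vector per (output, input) edge: the learnable part.
        self.coef = rng.normal(0.0, 0.1, size=(out_dim, in_dim, num_basis))

    def __call__(self, x):
        # x: (batch, in_dim) -> basis activations: (batch, in_dim, num_basis)
        basis = np.exp(-(((x[..., None] - self.centers) / self.width) ** 2))
        # Evaluate every edge function and sum over inputs at each output node.
        return np.einsum("bik,oik->bo", basis, self.coef)

    def num_params(self):
        return self.coef.size


def mlp_param_count(sizes):
    """Weights plus biases of a fully connected network with these layer sizes."""
    return sum(i * o + o for i, o in zip(sizes[:-1], sizes[1:]))


# Hypothetical dimensions, chosen only for illustration.
obs_dim, act_dim = 24, 6
kan_policy = RBFKANLayer(obs_dim, act_dim, num_basis=8)
actions = kan_policy(np.zeros((5, obs_dim)))  # forward pass on a dummy batch

print("KAN-style layer parameters:", kan_policy.num_params())
print("MLP parameters:", mlp_param_count([obs_dim, 64, 64, act_dim]))
```

The counts printed here depend entirely on the assumed sizes and say nothing about the configurations or results reported in the paper. In a PPO setup, a layer like this would stand in for the MLP in the actor and critic, with the coefficients trained by gradient descent in an autodiff framework.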
