Kolmogorov-Arnold Network for Online Reinforcement Learning

Victor Augusto Kich, Jair Augusto Bottega, Raul Steinmetz, Ricardo Bedin Grando, Ayano Yorozu, Akihisa Ohya

TL;DR

This paper applies Kolmogorov-Arnold Networks (KANs), memory-efficient universal function approximators grounded in the Kolmogorov-Arnold representation theorem, to online reinforcement learning. The authors use KANs as both the policy and the value function approximator within Proximal Policy Optimization (PPO) and benchmark them against MLP-based PPO on the DeepMind Control Proprio robotics suite, finding comparable performance with substantially fewer parameters. Key contributions include the first application of KANs in online RL, a systematic comparison with MLP-based PPO across six continuous-control tasks, and an analysis of the trade-off between parameter efficiency and speed. The work suggests KANs are a promising option for resource-constrained RL in robotics; code is available at the project repository.

Abstract

Kolmogorov-Arnold Networks (KANs) have shown potential as an alternative to Multi-Layer Perceptrons (MLPs) in neural networks, providing universal function approximation with fewer parameters and reduced memory usage. In this paper, we explore the use of KANs as function approximators within the Proximal Policy Optimization (PPO) algorithm. We evaluate this approach by comparing its performance to the original MLP-based PPO using the DeepMind Control Proprio Robotics benchmark. Our results indicate that the KAN-based reinforcement learning algorithm can achieve comparable performance to its MLP-based counterpart, often with fewer parameters. These findings suggest that KANs may offer a more efficient option for reinforcement learning models.
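
To make the idea of a KAN as a function approximator concrete, the following is a minimal sketch of a single KAN-style layer in plain Python. It is not the authors' implementation. It assumes each edge function is a weighted sum of fixed Gaussian bumps on [-1, 1], a simple stand-in for the B-spline parameterization commonly used in KAN code. The class name `KANLayer`, the helper `mlp_layer_params`, and all sizes are hypothetical and chosen only for illustration.

```python
import math
import random


class KANLayer:
    """Toy KAN layer: y_j = sum_i phi_ij(x_i), each phi_ij a learnable 1-D function.

    Assumption: phi_ij is a weighted sum of fixed Gaussian bumps on [-1, 1],
    used here in place of B-splines to keep the sketch short.
    """

    def __init__(self, in_dim, out_dim, num_basis=8, seed=0):
        rng = random.Random(seed)
        self.in_dim, self.out_dim, self.num_basis = in_dim, out_dim, num_basis
        # Fixed, evenly spaced basis centers on [-1, 1] (not trained).
        self.centers = [-1.0 + 2.0 * k / (num_basis - 1) for k in range(num_basis)]
        self.width = 2.0 / (num_basis - 1)
        # Learnable coefficients: one vector of length num_basis per edge (i, j).
        self.coef = [[[rng.gauss(0.0, 0.1) for _ in range(num_basis)]
                      for _ in range(in_dim)]
                     for _ in range(out_dim)]

    def basis(self, x):
        """Evaluate every Gaussian bump at the scalar x."""
        return [math.exp(-((x - c) / self.width) ** 2) for c in self.centers]

    def forward(self, x):
        """Map a list of in_dim floats to a list of out_dim floats."""
        feats = [self.basis(xi) for xi in x]  # basis values, computed once per input
        return [sum(sum(w * b for w, b in zip(self.coef[j][i], feats[i]))
                    for i in range(self.in_dim))
                for j in range(self.out_dim)]

    def num_params(self):
        return self.in_dim * self.out_dim * self.num_basis


def mlp_layer_params(in_dim, out_dim):
    """Weights plus biases of one dense layer, for comparison."""
    return in_dim * out_dim + out_dim


layer = KANLayer(in_dim=4, out_dim=2)
out = layer.forward([0.1, -0.3, 0.5, 0.9])
```

In this sketch a KAN layer of a given width stores more parameters than a dense layer of the same width (`in_dim * out_dim * num_basis` versus `in_dim * out_dim + out_dim`), so any overall saving would have to come from using narrower or shallower networks. In a PPO setting, one stack of such layers would map observations to action-distribution parameters and another would map observations to a scalar value estimate.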

Paper Structure

This paper contains 14 sections, 10 equations, 3 figures, 2 tables, 1 algorithm.

Figures (3)

  • Figure 1: Overview of the proposed framework.
  • Figure 2: KAN architecture for the DMC Proprio Swimmer-v4 environment.
  • Figure 3: Average reward comparison across all trained environments for the proposed architectures (5 seeds).