Input Convex Kolmogorov Arnold Networks

Thomas Deschatre, Xavier Warin

TL;DR

This work introduces Input Convex Kolmogorov Arnold Networks (ICKAN) to approximate convex functions while preserving convexity, leveraging the Kolmogorov-Arnold representation. It develops two main variants, P1-ICKAN (piecewise-linear) and Cubic-ICKAN (Hermite cubic), with optional grid adaptation and universal approximation guarantees for the adapted case. The paper provides convergence theorems, analyzes layer construction, and demonstrates competitive performance against standard ICNNs on function-approximation and toy control tasks. It then extends to Partial ICKAN (PICKAN) for partially convex targets and applies ICKAN to optimal transport via a Brenier potential-based formulation, showing favorable results on synthetic data, especially in higher dimensions and tensorized settings. Overall, ICKANs offer a principled, interpretable alternative to ICNNs with potential practical benefits for separable or convex-structure problems in optimization and transport.

Abstract

This article presents an input convex neural network architecture using Kolmogorov-Arnold networks (ICKAN). Two specific networks are presented: the first is based on a low-order, piecewise-linear representation of functions, for which a universal approximation theorem is provided. The second is based on cubic splines, for which convergence is supported only by numerical results. We demonstrate on simple tests that these networks perform competitively with classical input convex neural networks (ICNNs). In a second part, we use the networks to solve optimal transport problems that require a convex approximation of functions, and demonstrate their effectiveness. Comparisons with ICNNs show that cubic ICKANs produce results similar to those of classical ICNNs.
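
To make the idea of a convexity-preserving piecewise-linear representation concrete, here is a minimal one-dimensional sketch. It is not the authors' layer construction: the function name `convex_pwl`, the softplus parametrization of the slope increments, and the uniform grid are all assumptions chosen for illustration. The only point it shows is that a piecewise-linear function whose slopes are non-decreasing from cell to cell is convex.

```python
import numpy as np


def convex_pwl(x, grid, value0, slope0, raw_increments):
    """Evaluate a convex piecewise-linear function on a 1D grid.

    Hypothetical parametrization (not taken from the paper): the slope on the
    first cell is slope0, and each later cell adds a non-negative increment,
    so slopes are non-decreasing and the function is convex.
    """
    grid = np.asarray(grid, dtype=float)
    # softplus maps unconstrained parameters to non-negative increments
    increments = np.logaddexp(0.0, np.asarray(raw_increments, dtype=float))
    # one slope per cell: len(grid) - 1 slopes in total
    slopes = slope0 + np.concatenate(([0.0], np.cumsum(increments)))
    # node values follow by accumulating slope * cell width
    node_values = value0 + np.concatenate(([0.0], np.cumsum(slopes * np.diff(grid))))
    return np.interp(x, grid, node_values)


# Usage: 5 cells on [0, 1], hence 4 free slope increments.
grid = np.linspace(0.0, 1.0, 6)
rng = np.random.default_rng(0)
raw = rng.normal(size=4)
xs = np.linspace(0.0, 1.0, 201)
ys = convex_pwl(xs, grid, value0=0.0, slope0=-1.0, raw_increments=raw)

# On a uniform sample, convexity means non-negative second differences.
second_diff = ys[:-2] - 2.0 * ys[1:-1] + ys[2:]
```

Because the constraint is built into the parametrization, any value of the unconstrained parameters yields a convex function, so no projection step is needed during training in this sketch.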

Paper Structure

This paper contains 27 sections, 2 theorems, 62 equations, 21 figures, 7 tables, 1 algorithm.

Key Result

Theorem 2.1

For $P>1$, the space spanned by the P1-ICKANs, letting $n_l$ for $l=0, \ldots, L-1$ and $L$ vary, is dense in the set of Lipschitz convex functions on $[0,1]^n$ for the sup norm when grid adaptation is used.

Figures (21)

  • Figure 1: Uniform P1 basis functions on $[0,1]$ with $P=5$.
  • Figure 2: Estimation of the function \ref{eq:wrong_Con} in dimension 1, wrongly supposing that the function is convex, network with $P=10$.
  • Figure 5: Partial Input Convex Kolmogorov Arnold Network using a piecewise linear approximation.
  • Figure 6: True target distribution and the distributions obtained by linear transport and by Cubic ICKAN transport with adapted mesh, $P=10$, and 64 and 32 neurons, for the map in korotin2021 with $d=2$. The first two figures include the empirical histogram as well as a Gaussian kernel density estimator with bandwidth selected using Scott's rule scott2005 (solid line for target, dashed lines for linear and ICKAN transports).
  • Figure 7: True target distribution and the distributions obtained by linear transport and by Cubic ICKAN transport with adapted mesh and $P=10$, for the map $T(x) = (T_i(x_i))_{i=1,2}$ with $T_i(x_i) = x_i + \frac{1}{6 - \cos(2\pi x_i)} - 0.2$, $i=1,2$. The first two figures include the empirical histogram as well as a Gaussian kernel density estimator with bandwidth selected using Scott's rule scott2005 (solid line for target, dashed lines for linear and ICKAN transports).
  • ...and 16 more figures
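
The separable map in the Figure 7 caption can be checked numerically to be increasing in each coordinate, which is what makes it the gradient of a separable convex potential, in line with the Brenier potential-based formulation mentioned in the TL;DR. The sketch below is an illustration only, not code from the paper: the function names and the sample grid on $[0,1]$ are assumptions. The derivative is $T_i'(x) = 1 - 2\pi \sin(2\pi x) / (6 - \cos(2\pi x))^2$, and since the denominator is at least $25$, it is bounded below by $1 - 2\pi/25 > 0$ for every real $x$.

```python
import numpy as np


def transport_component(x):
    """One coordinate of the map in the Figure 7 caption."""
    return x + 1.0 / (6.0 - np.cos(2.0 * np.pi * x)) - 0.2


def transport_component_derivative(x):
    """Analytic derivative of transport_component."""
    denom = 6.0 - np.cos(2.0 * np.pi * x)
    return 1.0 - 2.0 * np.pi * np.sin(2.0 * np.pi * x) / denom**2


# Sample grid on [0, 1] (an assumption; the bound holds for all real x).
xs = np.linspace(0.0, 1.0, 2001)
min_derivative = transport_component_derivative(xs).min()
lower_bound = 1.0 - 2.0 * np.pi / 25.0  # worst case: |sin| = 1, denominator = 5

# Independent check without the analytic derivative: values strictly increase.
is_increasing = bool(np.all(np.diff(transport_component(xs)) > 0.0))
```

A strictly positive derivative means each $T_i$ is the derivative of a strictly convex one-dimensional function, so the sum of these primitives is a convex potential whose gradient is $T$.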

Theorems & Definitions (9)

  • Remark 2.1
  • Remark 2.2
  • Theorem 2.1
  • Theorem 2.2
  • Remark 3.1
  • Remark 3.2
  • Remark 3.3
  • Remark 3.4
  • Remark B.1