
Efficient Convexification of Kolmogorov-Arnold Networks with Polynomial Functional Forms Via a Continuous Graham Scan Approach

Tianwei Li, Daniel Ovalle, Barnabas Poczos, Carl Laird, Ignacio Grossmann, Javier Pena

Abstract

Deterministic global optimization of nonlinear models is important in many scientific and engineering applications. This framework typically involves repeatedly solving convex relaxations of the nonconvex problem, meaning that the strength of the relaxations and the cost of computing them directly determine overall efficiency and solution quality. In this work, we develop a tailored continuous convexification framework for Kolmogorov-Arnold Networks in which the univariate components are polynomial functions. By exploiting the additive separable structure of this architecture, the relaxation problem reduces to computing tight convex envelopes of univariate polynomials. We propose a continuous variant of the classical Graham Scan that constructs these envelopes exactly by identifying the bitangents of the polynomial convex hull without discretization or factorable reformulations. We establish the correctness of the algorithm, characterize its computational complexity, and show how these envelopes can be combined to construct strong convex relaxations for polynomial KANs. Computational results demonstrate that the proposed relaxations are both strong and robust, often producing bounds that are comparable to, or even orders of magnitude tighter than, those of state-of-the-art global optimization solvers while remaining computationally efficient.
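As a point of reference for the continuous variant described in the abstract, the discrete "bottom half" Graham Scan it generalizes (Figure 1) can be sketched on a sampled polynomial. This is only an illustrative discretized baseline, the kind of approach the paper's exact method avoids; the function and variable names here are our own.

```python
def lower_hull(points):
    """Bottom half of the convex hull (Graham-scan / monotone-chain style).

    points: list of (x, y) pairs sorted by x. Returns the vertices of the
    lower convex hull, i.e. a piecewise-linear underestimator of the data.
    """
    hull = []
    for p in points:
        # Pop the last vertex while it lies on or above the edge from
        # hull[-2] to the new candidate p (non-left turn: cross <= 0).
        while len(hull) >= 2:
            (x1, y1), (x2, y2) = hull[-2], hull[-1]
            if (x2 - x1) * (p[1] - y1) - (y2 - y1) * (p[0] - x1) <= 0:
                hull.pop()
            else:
                break
        hull.append(p)
    return hull

# Discretized convex envelope of p(x) = x^4 - 2x^2 on [-2, 2]: the true
# envelope is linear (y = -1) between the two minimizers x = -1 and x = 1.
xs = [-2 + 4 * i / 200 for i in range(201)]
pts = [(x, x**4 - 2 * x**2) for x in xs]
env = lower_hull(pts)
```

The continuous algorithm of the paper obtains the analogous envelope exactly, without the sampling step above.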

Paper Structure

This paper contains 14 sections, 11 theorems, 62 equations, 5 figures, and 5 algorithms.

Key Result

Lemma 2

Fix any univariate polynomial $p: \mathbb{R} \to \mathbb{R}$ of degree $n$ and any compact interval $\mathcal{I} :=[x_L,x_U]\subset \mathbb{R}$. Then the set $\{x \in \mathcal{I} : p''(x) \geq 0\}$ on which $p$ is convex is a finite union of $O(n)$ mutually disjoint closed intervals. $\blacktriangleleft$
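The lemma's bound follows because $p''$ is a polynomial of degree $n-2$, so its real roots split $\mathcal{I}$ into $O(n)$ sign-constant pieces. A minimal numerical sketch of this decomposition (the function name and tolerances are our own, not the paper's):

```python
import numpy as np

def convex_intervals(coeffs, x_lo, x_hi, tol=1e-9):
    """Pieces of [x_lo, x_hi] on which the polynomial is convex.

    coeffs: polynomial coefficients, highest degree first (numpy order).
    Convexity holds where p'' >= 0; the real roots of p'' inside the
    interval are the only possible breakpoints, giving O(n) pieces.
    """
    p2 = np.polyder(np.polyder(np.poly1d(coeffs)))  # second derivative
    roots = sorted(r.real for r in p2.roots
                   if abs(r.imag) < tol and x_lo < r.real < x_hi)
    breakpts = [x_lo] + roots + [x_hi]
    intervals = []
    for a, b in zip(breakpts[:-1], breakpts[1:]):
        # Sign of p'' is constant on (a, b); test it at the midpoint.
        if p2((a + b) / 2) >= -tol:
            intervals.append((a, b))
    return intervals
```

For the polynomial of Figure 3, $p(x) = (x-2)^4 - 2(x-2)^2 - 0.5(x-2)$ on $[0.25, 3.75]$ (coefficients $[1, -8, 22, -24.5, 9]$ in $x$), this recovers the two Convex Intervals $[0.25, 1.423]$ and $[2.577, 3.75]$ quoted in the caption.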

Figures (5)

  • Figure 1: Classical Graham Scan Algorithm (Algorithm \ref{alg:og-graham}) for the discrete convex hull. Although originally designed for computing the convex hull of 2D points, here we only show how it computes the "bottom half" of the convex hull, to emphasize its relation to our continuous approach for function convex envelopes.
  • Figure 2: Our Continuous Graham Scan Algorithm (Algorithm \ref{alg:simp-graham}) for the convex envelope of univariate functions. Notice that its mechanism is quite comparable to that of the discrete Graham Scan algorithm.
  • Figure 3: Illustration of using Convex Conjugate to binary search for the correct slope of bitangents: we use $p(x) := (x-2)^4 - 2(x-2)^2 - 0.5(x-2)$ restricted to the compact interval $[0.25, 3.75]$. In this case, its Convex Intervals are approximately $\{I_1, I_2\} := \{[0.25, 1.423], [2.577, 3.75]\}$ and $p|_{I_1}^*(s_0) = p|_{I_2}^*(s_0) = 0.0$ for $s_0=-0.5$. Thus $\texttt{bitan}_p(I_1, I_2)(x) = -0.5x$. On the other hand, for $s = -0.25 > -0.5 = s_0$ we have $-p|_{I_1}^*(s) \approx -0.254 > -0.754\approx -p|_{I_2}^*(s).$
  • Figure 4: Performance of convex relaxations for PKAN models with six input variables and polynomial degree six as a function of the number of hidden units per layer, with panels grouped by the number of hidden layers. Bars report sample means across 50 randomly generated instances and error bars denote 95% confidence intervals; both quantities are shown on logarithmic scales. Integer labels above bars indicate the number of instances (out of 50) for which the corresponding solver failed to produce a nontrivial root-node relaxation. When failures occur, averages are computed only over successful instances. Absence of a number indicates that all instances produced valid bounds.
  • Figure 5: Effect of polynomial degree and input dimensionality on the quality and computational cost of convex relaxations for PKAN models with six hidden layers and six hidden units per layer. Bars represent averages over 50 randomly generated instances and error bars correspond to 95% confidence intervals computed from the sample standard error. Both metrics are displayed on logarithmic scales. Integer labels above bars indicate the number of instances (out of 50) for which the solver failed to produce a valid root-node relaxation. When such failures occur, the reported statistics are computed only over successful instances. Absence of a label shows that all instances produced nontrivial bounds.
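The binary search illustrated in Figure 3 can be sketched numerically: the conjugate $p|_I^*(s) = \max_{x \in I}(sx - p(x))$ is evaluated on each Convex Interval, and the bitangent slope is the $s_0$ where the two conjugates agree. Because the maximizer on the left interval is always smaller than on the right, the gap between the conjugates is strictly decreasing in $s$, which makes bisection valid. The grid-based conjugate and all names below are our own simplifications, not the paper's exact procedure:

```python
import numpy as np

def conjugate(p, interval, s, grid=2001):
    """Numerical convex conjugate p|_I^*(s) = max_{x in I} (s*x - p(x))."""
    xs = np.linspace(interval[0], interval[1], grid)
    return np.max(s * xs - np.polyval(p, xs))

def bitangent_slope(p, I1, I2, s_lo, s_hi, iters=80):
    """Bisect for the slope s0 with p|_{I1}^*(s0) == p|_{I2}^*(s0).

    As in Figure 3: for s above the bitangent slope the conjugate on the
    left interval is the smaller one, so the sign of the gap
    p|_{I1}^*(s) - p|_{I2}^*(s) tells us which half contains s0.
    """
    def gap(s):
        return conjugate(p, I1, s) - conjugate(p, I2, s)
    for _ in range(iters):
        mid = 0.5 * (s_lo + s_hi)
        if gap(mid) < 0:   # gap is decreasing, so s0 lies below mid
            s_hi = mid
        else:
            s_lo = mid
    return 0.5 * (s_lo + s_hi)
```

Applied to the Figure 3 polynomial (coefficients $[1, -8, 22, -24.5, 9]$ in $x$) with its two Convex Intervals, this recovers the bitangent slope $s_0 \approx -0.5$ from the caption.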

Theorems & Definitions (29)

  • Remark 1
  • Lemma 2
  • proof
  • Definition 1
  • Theorem 3
  • proof
  • Definition 2
  • Lemma 4
  • proof
  • Corollary 5
  • ...and 19 more