RPN: Reconciled Polynomial Network Towards Unifying PGMs, Kernel SVMs, MLP and KAN

Jiawei Zhang

TL;DR

This work introduces the Reconciled Polynomial Network (RPN), a general base model for deep function learning designed to unify probabilistic graphical models, kernel SVMs, MLP, and KAN under a single canonical representation. Inspired by Taylor's theorem, RPN decomposes the target mapping into the inner product of a data expansion κ(x) and a parameter reconciliation ψ(w), plus a remainder π(x). This yields interpretable, modular architectures that can be shallow or deeply stacked, wide with multiple heads and channels, or enlarged with nested and extended expansions. The paper provides a broad catalog of expansion, reconciliation, and remainder function templates implementable via the TinyBig toolkit, and reports that RPN achieves superior or competitive performance on continuous function fitting, discrete image/text classification, and probabilistic dependency inference, often with fewer learnable parameters than competing baselines. The authors also discuss interpretability, VC-dimension considerations, and biological neuroscience analogies to justify the design, and provide a public toolkit to facilitate replication and extension. Overall, RPN offers a flexible, interpretable, and reusable framework with the potential to unify diverse learning paradigms and support continual, multi-modal learning in AI systems.

Abstract

In this paper, we introduce a novel deep model named Reconciled Polynomial Network (RPN) for deep function learning. RPN has a very general architecture and can be used to build models with various complexities, capacities, and levels of completeness, which all contribute to the correctness of these models. As indicated in the subtitle, RPN can also serve as the backbone to unify different base models into one canonical representation. This includes non-deep models, like probabilistic graphical models (PGMs) - such as Bayesian networks and Markov networks - and kernel support vector machines (kernel SVMs), as well as deep models like the classic multi-layer perceptron (MLP) and the recent Kolmogorov-Arnold network (KAN). Technically, RPN proposes to disentangle the underlying function to be inferred into the inner product of a data expansion function and a parameter reconciliation function. Together with the remainder function, RPN accurately approximates the underlying functions that govern data distributions. The data expansion functions in RPN project data vectors from the input space to a high-dimensional intermediate space, whose form is specified by the definition of the expansion function. Meanwhile, RPN also introduces the parameter reconciliation functions, which fabricate a small number of parameters into a higher-order parameter matrix, to address the "curse of dimensionality" problem caused by the data expansions. Moreover, the remainder functions provide RPN with additional complementary information to reduce potential approximation errors. We conducted extensive empirical experiments on numerous benchmark datasets across multiple modalities, including continuous function datasets, discrete vision and language datasets, and classic tabular datasets, to investigate the effectiveness of RPN.
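
The decomposition described in the abstract can be illustrated with a minimal NumPy sketch of a single head computing ⟨κ(x), ψ(w)⟩ + π(x). This is not the TinyBig API: the function names, the second-order Taylor-style expansion, the low-rank reconciliation, and the zero remainder are all choices assumed here for illustration, and the inner product is assumed to be the matrix-vector product ψ(w) κ(x).

```python
import numpy as np

def taylor_expansion(x):
    """Hypothetical kappa: R^m -> R^D, D = m + m^2 (first- and second-order terms)."""
    second_order = np.outer(x, x).ravel()
    return np.concatenate([x, second_order])

def low_rank_reconciliation(w, n, D, r):
    """Hypothetical psi: R^l -> R^{n x D}, l = r * (n + D), built as A @ B.T."""
    A = w[: n * r].reshape(n, r)
    B = w[n * r :].reshape(D, r)
    return A @ B.T

def zero_remainder(x, n):
    """Hypothetical pi: R^m -> R^n that contributes nothing."""
    return np.zeros(n)

def rpn_head(x, w, n, r):
    """One head: <kappa(x), psi(w)> + pi(x), with the inner product taken as psi(w) @ kappa(x)."""
    kx = taylor_expansion(x)
    W = low_rank_reconciliation(w, n, kx.size, r)
    return W @ kx + zero_remainder(x, n)

m, n, r = 3, 2, 1                      # input dim, output dim, rank (illustrative values)
D = m + m * m                          # expansion dimension
rng = np.random.default_rng(0)
w = rng.normal(size=r * (n + D))       # learnable parameter vector of length l
x = np.array([1.0, 2.0, 3.0])
y = rpn_head(x, w, n, r)

print("output shape:", y.shape)
print("learnable parameters:", w.size, "vs dense n*D:", n * D)
```

In this toy setting the expansion raises the input from 3 to 12 dimensions, while the low-rank reconciliation needs only r(n + D) = 14 parameters instead of the n·D = 24 entries of a dense matrix, which is the kind of saving the reconciliation function is meant to provide.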

Paper Structure

This paper contains 107 sections, 3 theorems, 91 equations, 31 figures, 12 tables.

Key Result

Theorem 1

(Taylor's Theorem): Let $d \ge 1$ be an integer and let the function $f: \mathbbm{R} \to \mathbbm{R}$ be $d$ times differentiable at the point $a \in \mathbbm{R}$. Then, as illustrated in Figure 3, there exists a function $h_d: \mathbbm{R} \to \mathbbm{R}$ such that
$$f(x) = \sum_{k=0}^{d} \frac{f^{(k)}(a)}{k!} (x - a)^k + R_d(x).$$
In the equation, $R_d(x)$ is normally called the "remainder" term and can be represented as
$$R_d(x) = h_d(x) (x - a)^d, \quad \text{with } \lim_{x \to a} h_d(x) = 0.$$
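
A small numerical check can make the role of the remainder concrete. The sketch below assumes the standard Peano form of Taylor's theorem, in which the remainder is $R_d(x) = f(x) - \sum_{k=0}^{d} \frac{f^{(k)}(a)}{k!}(x-a)^k$ and $h_d(x) = R_d(x) / (x-a)^d$ tends to 0 as $x \to a$. The choice of $f = \exp$, $a = 0$, $d = 3$, and the helper name `taylor_polynomial` are illustrative assumptions, not taken from the paper.

```python
import math

def taylor_polynomial(derivs_at_a, a, x):
    """Evaluate sum_k f^(k)(a) / k! * (x - a)^k for the given derivative values."""
    return sum(c / math.factorial(k) * (x - a) ** k
               for k, c in enumerate(derivs_at_a))

a, d = 0.0, 3
derivs = [1.0] * (d + 1)   # every derivative of exp at 0 equals 1

hs = []
for x in (0.5, 0.1, 0.01):
    remainder = math.exp(x) - taylor_polynomial(derivs, a, x)   # R_d(x)
    h = remainder / (x - a) ** d                                # h_d(x)
    hs.append(h)
    print(f"x={x}: R_d(x)={remainder:.3e}, h_d(x)={h:.3e}")
```

The printed values of $h_d(x)$ shrink as $x$ approaches $a$, which is the behaviour the theorem guarantees. RPN's remainder function π(x) plays the analogous role of absorbing what the expansion-reconciliation product does not capture.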

Figures (31)

  • Figure 1: The timeline illustrates the development of various dominant machine learning base models over the past 70 years, with different colors representing different models. Orange Color: probabilistic graphical models (1980s to mid-2000s); Blue Color: support vector machine (mid 1990s to early 2010s); Green Color: deep learning models (mid-2010s to present); and Purple Color: deep function learning (2020s to present).
  • Figure 2: A comparison of RPN with Bayesian Network, Markov Network, Kernel SVM, MLP and KAN in terms of mathematical theorem foundation, formula and model architecture. In the plots, learnable parameters and functions are shown in red, while unlearnable/fixed ones are shown in dark/gray. Inputs and outputs are drawn as solid circles, and expansions as hollow circles.
  • Figure 3: An illustration of Taylor's approximation of continuous functions.
  • Figure 4: An illustration of the RPN framework. The left plot illustrates the multi-layer ($K$-layer) architecture of RPN. Each layer uses multiple heads for function learning, whose outputs are fused together. The right plot illustrates the detailed architecture of an RPN head, involving data expansion, multi-channel parameter reconciliation, remainder, and their internal operations. The yellow components with dashed outlines denote optional data processing functions (e.g., activation functions and norm functions) for the inputs, expansions and outputs.
  • Figure 5: An illustration of the RPN layer with nested and extended data expansions. Plot (a): multi-layer RPN; Plot (b): single-layer RPN with nested data expansions; Plot (c): multi-head RPN; Plot (d): single-head RPN with extended data expansions.
  • ...and 26 more figures

Theorems & Definitions (6)

  • Definition 1
  • Theorem 1
  • Definition 2
  • Theorem 2
  • Theorem 3
  • Example 1