Simplex-FEM Networks (SiFEN): Learning A Triangulated Function Approximator
Chaymae Yahyati, Ismail Lamaakal, Khalid El Makkaoui, Ibrahim Ouahbi, Yassine Maleh
TL;DR
SiFEN is a learned simplex-based predictor that represents $f:\mathbb{R}^d \to \mathbb{R}^k$ as a globally $C^r$ finite-element field on a learned simplicial mesh in an optionally warped input space. At inference, each query activates exactly one simplex and touches at most $d+1$ Bernstein–Bezier basis functions, yielding explicit locality, smoothness control, and cache-friendly computation. The mesh, warp, and polynomial coefficients are trained end-to-end with shape regularization, semi-discrete OT coverage, and differentiable topology updates (edge flips). Under standard shape-regularity and bi-Lipschitz warp assumptions, the model attains the FEM-like approximation rate $M^{-m/d}$ for $M$ mesh vertices. Empirically, SiFEN matches or surpasses MLPs and KANs on synthetic, tabular, and CNN-head benchmarks at matched parameter budgets, improves calibration (lower NLL/Brier and ECE), and reduces inference latency due to locality. The framework combines geometry, approximation theory, and practical training techniques into a compact, interpretable alternative to dense predictors with strong performance and robustness.
Abstract
We introduce Simplex-FEM Networks (SiFEN), a learned piecewise-polynomial predictor that represents $f:\mathbb{R}^d \to \mathbb{R}^k$ as a globally $C^r$ finite-element field on a learned simplicial mesh in an optionally warped input space. Each query activates exactly one simplex and at most $d+1$ basis functions via barycentric coordinates, yielding explicit locality, controllable smoothness, and cache-friendly sparsity. SiFEN pairs degree-$m$ Bernstein-Bezier polynomials with a light invertible warp and trains end-to-end with shape regularization, semi-discrete OT coverage, and differentiable edge flips. Under standard shape-regularity and bi-Lipschitz warp assumptions, SiFEN achieves the classic FEM approximation rate $M^{-m/d}$ with $M$ mesh vertices. Empirically, on synthetic approximation tasks, tabular regression/classification, and as a drop-in head on compact CNNs, SiFEN matches or surpasses MLPs and KANs at matched parameter budgets, improves calibration (lower ECE/Brier), and reduces inference latency due to geometric locality. These properties make SiFEN a compact, interpretable, and theoretically grounded alternative to dense MLPs and edge-spline networks.
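
To make the locality claim concrete, the following is a minimal sketch of barycentric evaluation on a fixed simplicial mesh. It is not the authors' implementation. It assumes the simplest case (degree 1, no warp, a hand-built mesh, brute-force simplex search), and the names `barycentric_coords`, `locate_simplex`, and `evaluate_p1` are hypothetical helpers chosen for illustration. The point is only that a query touches the $d+1$ vertex values of a single simplex.

```python
import numpy as np

def barycentric_coords(vertices, x):
    """Barycentric coordinates of x in the simplex with the given vertices.

    vertices: (d+1, d) array of simplex corners; x: (d,) query point.
    Returns a (d+1,) array of weights that sum to 1.
    """
    # Solve [v_1 - v_0, ..., v_d - v_0] @ lam[1:] = x - v_0.
    edge_matrix = (vertices[1:] - vertices[0]).T  # shape (d, d)
    lam_rest = np.linalg.solve(edge_matrix, x - vertices[0])
    return np.concatenate(([1.0 - lam_rest.sum()], lam_rest))

def locate_simplex(points, simplices, x, tol=1e-12):
    """Brute-force point location (an assumption made for brevity).

    Returns (index, weights) of the first simplex whose barycentric
    weights are all non-negative, or (None, None) if x is outside the mesh.
    """
    for idx, simplex in enumerate(simplices):
        lam = barycentric_coords(points[simplex], x)
        if np.all(lam >= -tol):
            return idx, lam
    return None, None

def evaluate_p1(points, simplices, values, x):
    """Degree-1 (piecewise-linear) field: blend d+1 vertex values.

    values: (n_vertices, k) array of per-vertex outputs.
    """
    idx, lam = locate_simplex(points, simplices, x)
    if idx is None:
        raise ValueError("query lies outside the mesh")
    return lam @ values[simplices[idx]]  # only d+1 rows of `values` are read

# Toy mesh: the unit square split into two triangles (d = 2, k = 1).
points = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
simplices = np.array([[0, 1, 2], [1, 3, 2]])
# Vertex values sampled from the affine function f(x, y) = 2x + 3y + 1.
values = (2.0 * points[:, 0] + 3.0 * points[:, 1] + 1.0).reshape(-1, 1)

query = np.array([0.25, 0.5])
prediction = evaluate_p1(points, simplices, values, query)
```

Because the toy target is affine, the piecewise-linear field reproduces it exactly at every query inside the square. Higher-degree Bernstein-Bezier elements, the learned warp, and an efficient point-location structure are outside the scope of this sketch.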
