Reproducing kernel methods for machine learning, PDEs, and statistics
Philippe G. LeFloch, Jean-Marc Mercier, Shohruh Miryusupov
TL;DR
This work develops a unified RKHS framework, combined with optimal transport, to advance kernel methods across ML, PDEs, and finance. It builds a theory stack (discrete and continuous RKHS, kernel engineering, and discrete kernel operators), then leverages OT concepts (Monge/Kantorovich, GW/GM, Sinkhorn) to design scalable, mesh-free discretizations and generative models. Across applications, it demonstrates kernel ridge regression and related RKHS tools for regression, classification, clustering, and generation, and shows how OT-inspired maps yield sample-efficient, data-driven models with reproducibility guarantees. The practical impact lies in reproducible benchmarks, scalable kernels, and a cohesive pathway from kernel theory to industrial tasks in computational physics and mathematical finance. The monograph also provides CodPy-based tooling and a companion site with runnable Python code to disseminate kernel-based RKHS methods.
Abstract
This monograph develops a unified, application-driven framework for kernel methods grounded in reproducing kernel Hilbert spaces (RKHS) and optimal transport (OT). Part I lays the theoretical and numerical foundations on positive-definite kernels; discrete and continuous RKHS; kernel engineering and scaling maps; error assessment via kernel discrepancy/maximum mean discrepancy (MMD); and a systematic operator view of kernels. In this viewpoint, projection, gradient, divergence, and Laplace-Beltrami operators are built directly from kernels, enabling discrete analogues of differential operators and variational tools that connect learning with PDE-style modeling. Part II turns to practice across four domains. In machine learning, we treat supervised and unsupervised tasks, then develop RKHS-based generative modeling, contrasting density and projection approaches and enhancing them with OT and scalable, combinatorial assignments. We introduce clustering strategies that reduce computational burden and support large-scale regression and transport. In physics-informed modeling, we present mesh-free kernel discretizations for elliptic and time-dependent PDEs, discuss automatic differentiation, and propose high-order discrete approximations. In reinforcement learning, we formulate kernel Q-learning and non-parametric HJB methods, and show how kernel operators yield sample-efficient baselines on continuous-state, discrete-action tasks. In mathematical finance, we build nonparametric time-series models and market generators, study benchmarking and extrapolation for pricing, and apply the framework to stress testing and portfolio methods.
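As a rough illustration of two ingredients named above, kernel ridge regression and error assessment via MMD, the following is a minimal NumPy sketch. It does not use the CodPy API. The Gaussian kernel, the bandwidth and regularization values, and all function names are assumptions made for this example only.

```python
import numpy as np

def gaussian_kernel(x, y, bandwidth=1.0):
    """Gram matrix k(x_i, y_j) = exp(-|x_i - y_j|^2 / (2 * bandwidth^2))."""
    sq_dists = (np.sum(x**2, axis=1)[:, None]
                + np.sum(y**2, axis=1)[None, :]
                - 2.0 * x @ y.T)
    # Clip tiny negative values caused by floating-point round-off.
    return np.exp(-np.maximum(sq_dists, 0.0) / (2.0 * bandwidth**2))

def krr_fit(x_train, f_train, bandwidth=1.0, reg=1e-6):
    """Solve (K + reg * I) c = f for the kernel ridge coefficients c."""
    gram = gaussian_kernel(x_train, x_train, bandwidth)
    return np.linalg.solve(gram + reg * np.eye(len(x_train)), f_train)

def krr_predict(x_test, x_train, coeffs, bandwidth=1.0):
    """Evaluate the fitted function: f(x) = sum_j c_j k(x, x_j)."""
    return gaussian_kernel(x_test, x_train, bandwidth) @ coeffs

def mmd_squared(x, y, bandwidth=1.0):
    """Biased (V-statistic) estimate of the squared MMD between two samples."""
    return (gaussian_kernel(x, x, bandwidth).mean()
            - 2.0 * gaussian_kernel(x, y, bandwidth).mean()
            + gaussian_kernel(y, y, bandwidth).mean())

# Toy data (assumption): noiseless samples of sin(pi * x) on [-1, 1].
rng = np.random.default_rng(0)
x_train = rng.uniform(-1.0, 1.0, size=(50, 1))
f_train = np.sin(np.pi * x_train[:, 0])

coeffs = krr_fit(x_train, f_train, bandwidth=0.3)
x_test = np.linspace(-0.9, 0.9, 20)[:, None]
pred = krr_predict(x_test, x_train, coeffs, bandwidth=0.3)
max_err = np.max(np.abs(pred - np.sin(np.pi * x_test[:, 0])))

# MMD between the training sample and a shifted copy of it.
mmd_same = mmd_squared(x_train, x_train)
mmd_shifted = mmd_squared(x_train, x_train + 2.0)
```

In this sketch the squared MMD is zero, up to round-off, when both samples coincide, and grows as the second sample is shifted away from the first, which is the property that makes it usable as a discrepancy measure between training and test sets.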
