Notes on Kernel Methods in Machine Learning
Diego Armando Pérez-Rosero, Danna Valentina Salazar-Dubois, Juan Camilo Lugo-Rojas, Andrés Marino Álvarez-Meza, Germán Castellanos-Dominguez
TL;DR
These notes connect probability theory and nonlinear learning by embedding distributions into reproducing kernel Hilbert spaces (RKHS). They develop the theory of positive definite kernels and RKHS, introduce covariance and Hilbert–Schmidt operators, and relate these constructs to kernel density estimation (KDE), mean embeddings, and the Maximum Mean Discrepancy (MMD). The framework gives geometric interpretations of estimation, dependence, and information measures in high- or infinite-dimensional feature spaces, and lays a foundation for Gaussian processes, kernel Bayesian inference, and functional-analytic approaches to modern machine learning. The concrete tools covered (KDE, MMD, kernel PCA) extend classical statistics to nonlinear, distribution-aware kernel methods.
Abstract
These notes provide a self-contained introduction to kernel methods and their geometric foundations in machine learning. Starting from the construction of Hilbert spaces, we develop the theory of positive definite kernels, reproducing kernel Hilbert spaces (RKHS), and Hilbert–Schmidt operators, emphasizing their role in statistical estimation and the representation of probability measures. Classical concepts such as covariance, regression, and information measures are revisited through the lens of Hilbert space geometry. We also introduce kernel density estimation, kernel embeddings of distributions, and the Maximum Mean Discrepancy (MMD). The exposition is designed to serve as a foundation for more advanced topics, including Gaussian processes, kernel Bayesian inference, and functional-analytic approaches to modern machine learning.
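
As a rough illustration of the kernel embeddings and MMD mentioned above, the following minimal sketch estimates the squared MMD between two samples as the squared RKHS distance between their empirical mean embeddings. The Gaussian kernel, the bandwidth `sigma`, the sample sizes, and the function names `gaussian_kernel` and `mmd2_biased` are all assumptions chosen for illustration; they are not taken from the notes. The estimator shown is the standard biased (V-statistic) form.

```python
import numpy as np


def gaussian_kernel(X, Y, sigma=1.0):
    """Gaussian kernel matrix k(x, y) = exp(-||x - y||^2 / (2 sigma^2)).

    X has shape (n, d) and Y has shape (m, d); the result has shape (n, m).
    """
    # Pairwise squared Euclidean distances via ||x||^2 + ||y||^2 - 2 <x, y>.
    sq_dists = (
        np.sum(X**2, axis=1)[:, None]
        + np.sum(Y**2, axis=1)[None, :]
        - 2.0 * X @ Y.T
    )
    # Clip tiny negative values caused by floating-point round-off.
    sq_dists = np.maximum(sq_dists, 0.0)
    return np.exp(-sq_dists / (2.0 * sigma**2))


def mmd2_biased(X, Y, sigma=1.0):
    """Biased estimate of MMD^2: ||mean embedding of X - mean embedding of Y||^2."""
    k_xx = gaussian_kernel(X, X, sigma).mean()
    k_yy = gaussian_kernel(Y, Y, sigma).mean()
    k_xy = gaussian_kernel(X, Y, sigma).mean()
    return k_xx + k_yy - 2.0 * k_xy


# Hypothetical data: two samples from the same Gaussian, one from a shifted Gaussian.
rng = np.random.default_rng(0)
X = rng.normal(loc=0.0, scale=1.0, size=(200, 2))
Y_same = rng.normal(loc=0.0, scale=1.0, size=(200, 2))
Y_shift = rng.normal(loc=1.5, scale=1.0, size=(200, 2))

mmd_same = mmd2_biased(X, Y_same)
mmd_shift = mmd2_biased(X, Y_shift)
print(f"MMD^2 (same distribution):    {mmd_same:.4f}")
print(f"MMD^2 (shifted distribution): {mmd_shift:.4f}")
```

Under these assumptions, the estimate for the shifted sample should come out noticeably larger than the one for the two samples drawn from the same distribution, which is the sense in which MMD measures a distance between distributions. The bandwidth is fixed by hand here; in practice it is usually chosen from the data.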
