On the Conditions for Domain Stability for Machine Learning: a Mathematical Approach
Gabriel Pedroza
TL;DR
Addresses the need for a rigorous notion of stability in ML by modeling classifiers as functions on a metric space $(S,d)$ and analyzing how domain topology affects stability. Proposes a formal stability definition for a classifier $M$ on a partition $\{D_i\}$ of $D\subset S$ via stable points $x_y\in D$ with $M(x_y)=y$ and a neighborhood $B(x_y,\delta)\subset D$ on which $M$ remains constant. Shows that when either $D$ or its complement $D^{c}$ is dense, stable points cannot exist, while open and bounded domains support stability, and establishes equivalences: stability $\Leftrightarrow$ accumulation points $\Leftrightarrow$ accumulation-series. Argues that these equivalences enable discrete, finite-precision testing and positions the framework as a foundation for provable stability criteria and future work on discrete algorithms.
Abstract
This work proposes a mathematical approach that (re)defines stability, a property of Machine Learning models, and determines sufficient conditions to validate it. Machine Learning models are represented as functions, and the characteristics in scope depend upon the domain of the function, which allows us to adopt the theory of topological and metric spaces as a basis. Finally, this work provides some equivalences that are useful for proving and testing stability in Machine Learning models. The results suggest that, whenever stability is aligned with the notion of function smoothness, the stability of Machine Learning models depends primarily upon certain topological, measurable properties of the classification sets within the ML model domain.
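
As a rough illustration of the stable-point notion summarized in the TL;DR (a point $x_y$ with $M(x_y)=y$ and a ball $B(x_y,\delta)$ on which $M$ is constant), the following minimal Python sketch checks that condition on a finite set of probe points. It is not the paper's algorithm: the names `is_empirically_stable`, `threshold_model`, and `grid` are hypothetical, the metric is assumed to be Euclidean, and the probes are assumed to lie in $D$. A finite check of this kind can only refute stability at a point; passing it does not prove that $M$ is constant on the whole ball, nor that the ball is contained in $D$.

```python
import math
from typing import Callable, Hashable, Iterable, Sequence

Point = Sequence[float]


def euclidean(a: Point, b: Point) -> float:
    """Euclidean metric d(a, b); assumed here, any metric on S could be passed instead."""
    return math.dist(a, b)


def is_empirically_stable(
    model: Callable[[Point], Hashable],
    x: Point,
    delta: float,
    probes: Iterable[Point],
    dist: Callable[[Point, Point], float] = euclidean,
) -> bool:
    """Return True if `model` gives the same label as at `x` on every probe
    inside the open ball B(x, delta). Probes outside the ball are ignored."""
    label = model(x)
    return all(model(p) == label for p in probes if dist(x, p) < delta)


def threshold_model(p: Point) -> int:
    """Hypothetical toy classifier on the real line: class 1 if p >= 0, else class 0."""
    return 1 if p[0] >= 0 else 0


# Finite-precision probe grid on [-2, 2] with step 0.25 (hypothetical choice).
grid = [(-2 + 0.25 * k,) for k in range(17)]

# Far from the decision boundary: no probe in the ball changes the label.
far_from_boundary = is_empirically_stable(threshold_model, (1.0,), 0.5, grid)

# Close to the decision boundary: the probe at -0.25 lies in the ball and flips the label.
near_boundary = is_empirically_stable(threshold_model, (0.1,), 0.5, grid)
```

In this toy setting, `far_from_boundary` evaluates to `True` and `near_boundary` to `False`, since the ball of radius 0.5 around 0.1 crosses the decision boundary at 0.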
