Erzeugungsgrad, VC-Dimension and Neural Networks with rational activation function
Luis Miguel Pardo, Daniel Sebastián
TL;DR
We extend Heintz's Erzeugungsgrad to constructible sets using the refined degree notions ${\rm deg}_{\rm lci}$, ${\rm deg}_z$, and ${\rm deg}_{\pi}$, and we prove that the VC-dimension of parameterized families of constructible classifiers is bounded by the Krull dimension of the parameter space up to logarithmic factors. This yields tight combinatorial bounds that link intersections in affine algebraic geometry with learning-theoretic growth bounds, and it enables density results for correct test sequences (CTS) in evasive varieties of positive dimension. The framework is then applied to neural networks with rational activation functions, giving growth-function bounds, CTS-density results, and constructive reductions that simulate rational activations with polynomial ones, all within a single algebraic-geometric approach. The work thus connects Intersection Theory and Computational Learning Theory, providing quantitative tools for analyzing the capacity and testability of algebraic classifier families and their neural-network realizations. The results hold in arbitrary characteristic and rely on algebraic-geometric notions that generalize classical real-analytic bounds to the constructible-set setting.
Abstract
The notion of Erzeugungsgrad was introduced by Joos Heintz in 1983 to bound the number of non-empty cells occurring after a process of quantifier elimination. We extend this notion and the combinatorial bounds of Theorem 2 in Heintz (1983) using the degree for constructible sets defined in Pardo-Sebastián (2022). We show that the Erzeugungsgrad is the key ingredient to connect affine Intersection Theory over algebraically closed fields and the VC-Theory of Computational Learning Theory for families of classifiers given by parameterized families of constructible sets. In particular, we prove that the VC-dimension and the Krull dimension are linearly related up to logarithmic factors based on Intersection Theory. Using this relation, we study the density of correct test sequences in evasive varieties. We apply these ideas to analyze parameterized families of neural networks with rational activation function.
