The Curvature Rate λ: A Scalar Measure of Input-Space Sharpness in Neural Networks

Jacob Poschl

TL;DR

The paper introduces λ, a scalar curvature measure defined in input space as the exponential growth rate of higher-order input derivatives, estimated from low-order derivatives during training. By showing that λ equals −log R for analytic functions (R being the radius of convergence) and log Ω for bandlimited signals (Ω being the spectral cutoff), the work unifies classical notions of smoothness and spectral content and extends this perspective to neural networks, where λ tracks decision-boundary complexity. A regularizer, Curvature Rate Regularization (CRR), is proposed to control λ directly; it yields flatter input-space geometry and improved calibration with minimal accuracy loss, and is competitive with Sharpness-Aware Minimization (SAM). This functional, parameterization-invariant framing offers a principled tool for characterizing and shaping neural representations, with implications for generalization, calibration, and robustness across tasks and architectures.

Abstract

Curvature influences generalization, robustness, and how reliably neural networks respond to small input perturbations. Existing sharpness metrics are typically defined in parameter space (e.g., Hessian eigenvalues) and can be expensive, sensitive to reparameterization, and difficult to interpret in functional terms. We introduce a scalar curvature measure defined directly in input space: the curvature rate λ, given by the exponential growth rate of higher-order input derivatives. Empirically, λ is estimated as the slope of log ||D^n f|| versus n for small n. This growth-rate perspective unifies classical analytic quantities: for analytic functions, λ corresponds to the inverse radius of convergence, and for bandlimited signals, it reflects the spectral cutoff. The same principle extends to neural networks, where λ tracks the emergence of high-frequency structure in the decision boundary. Experiments on analytic functions and neural networks (Two Moons and MNIST) show that λ evolves predictably during training and can be directly shaped using a simple derivative-based regularizer, Curvature Rate Regularization (CRR). Compared to Sharpness-Aware Minimization (SAM), CRR achieves similar accuracy while yielding flatter input-space geometry and improved confidence calibration. By grounding curvature in differentiation dynamics, λ provides a compact, interpretable, and parameterization-invariant descriptor of functional smoothness in learned models.
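The slope-based estimate described in the abstract can be illustrated on a case where the answer is known. The following is a minimal sketch, not the paper's implementation: it assumes a sup-norm over a sample grid as the norm ||D^n f||, uses orders n = 1..4 as the "small n" range, and uses the closed-form derivatives of sin(Ωx) rather than automatic differentiation. The helper names (`fit_slope`, `derivative_norms_sin`, `estimate_curvature_rate`) are hypothetical. For this bandlimited signal the fitted slope should come out close to log Ω.

```python
import math


def fit_slope(xs, ys):
    """Ordinary least-squares slope of ys against xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    return cov / var


def derivative_norms_sin(omega, orders, num_points=256):
    """Sup-norm (over a grid on [0, 2*pi)) of the n-th derivative of sin(omega * x).

    Uses the closed form d^n/dx^n sin(omega * x) = omega**n * sin(omega * x + n * pi / 2).
    The choice of norm and grid is an assumption made for this illustration.
    """
    grid = [2 * math.pi * i / num_points for i in range(num_points)]
    norms = []
    for n in orders:
        values = [omega ** n * math.sin(omega * x + n * math.pi / 2) for x in grid]
        norms.append(max(abs(v) for v in values))
    return norms


def estimate_curvature_rate(norms, orders):
    """Estimate lambda as the slope of log ||D^n f|| versus n."""
    return fit_slope(orders, [math.log(v) for v in norms])


omega = 3.0
orders = [1, 2, 3, 4]  # low derivative orders only
lam = estimate_curvature_rate(derivative_norms_sin(omega, orders), orders)
print(f"estimated lambda = {lam:.4f}, log(omega) = {math.log(omega):.4f}")
```

For a trained network the derivative norms would instead come from automatic differentiation with respect to the input, and the fit would generally be noisier than in this closed-form case.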

Paper Structure

This paper contains 44 sections, 9 equations, 1 figure, 2 tables.

Figures (1)

  • Figure 1: Training dynamics reveal that unregularized models continue sharpening after generalization plateaus. Solid lines show the evolution of $\lambda$ and dashed lines show test error over 50 training epochs on Two Moons with 30% label noise (20 seeds per condition; shaded regions show 95% CI). The baseline and regularized models develop similar sharpness during initial learning (epochs 0--15), as test error drops from 36% to 31%. After epoch 15, test error plateaus for both conditions, but baseline $\lambda$ continues climbing monotonically from 1.3 to 1.85 (a 42% increase), while regularized $\lambda$ stabilizes at 1.2--1.3. This late-stage sharpening without any improvement in generalization is the signature of overfitting under label noise. Curvature Rate Regularization (orange, scale $= 0.003$) prevents this unnecessary complexity, achieving a 31% reduction in final $\lambda$ while maintaining equivalent test error (0.314 vs 0.310). The diverging confidence bands after epoch 20 indicate that this is a systematic effect rather than stochastic variation.