Nonparametric Regression in Dirichlet Spaces: A Random Obstacle Approach
Prem Talwai, David Simchi-Levi
TL;DR
The paper addresses nonparametric regression on general metric measure Dirichlet spaces, where pointwise evaluation is ill-posed because elements of a Dirichlet space are defined only almost everywhere. It introduces random obstacle renormalization, which replaces point evaluations with capacitary means derived from equilibrium potentials and truncated Green functions, yielding a well-posed objective that admits a representer theorem. The authors prove that the resulting renormalized ridge estimator achieves rate-optimal, out-of-sample convergence in this broad setting, with adaptive rates governed by the geometric and analytic properties encoded in the Dirichlet form. Their approach is measure-agnostic, applies to both manifolds and fractals, and advances the theoretical understanding of learning in non-Donsker spaces by linking capacity, Poincaré inequalities, and exit-time bounds to statistical risk. The work also contrasts the estimator with empirical risk minimization (ERM) over renormalized Dirichlet balls, highlighting a gap in rates that motivates further study of the curvature and functional-analytic assumptions needed to close it.
Abstract
In this paper, we consider nonparametric estimation over general Dirichlet metric measure spaces. Unlike the more commonly studied reproducing kernel Hilbert spaces, whose elements are defined pointwise, a Dirichlet space typically contains only equivalence classes, i.e., its elements are unique only almost everywhere. This lack of pointwise definition presents significant challenges for nonparametric estimation; for example, the classical ridge regression problem is ill-posed. We develop a new technique for renormalizing the ridge loss by replacing pointwise evaluations with certain \textit{local means} around the boundaries of obstacles centered at each data point. The resulting renormalized empirical risk functional is well-posed and admits a representer theorem in terms of certain equilibrium potentials, which are truncated versions of the associated Green function, cut off at a data-driven threshold. We demonstrate that the renormalized ridge estimator is rate-optimal, and derive an adaptive upper bound on its convergence rate that highlights the interplay between the analytic, geometric, and probabilistic properties of the Dirichlet form. Notably, our framework does not require smoothness of the underlying space, and applies to both manifold and fractal settings. To the best of our knowledge, this is the first paper to obtain optimal, out-of-sample convergence guarantees in the framework of general metric measure Dirichlet spaces.
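The renormalization described in the abstract can be illustrated schematically with standard potential theory. The sketch below is not the paper's formulation: all notation is assumed here for illustration, namely a transient regular Dirichlet form $(\mathcal{E}, \mathcal{F})$, obstacles $K_i$ around the data points $x_i$, their equilibrium potentials $e_{K_i}$ and equilibrium measures $\nu_{K_i}$, a quasi-continuous version $\tilde f$ of $f$, and a ridge parameter $\lambda > 0$. The paper's actual obstacles, normalization, and threshold choice may differ.

```latex
% Schematic only: every symbol below is assumed notation, not taken from the paper.
\begin{align*}
% Capacitary local mean that stands in for the point evaluation f(x_i)
M_i f &:= \frac{1}{\operatorname{Cap}(K_i)} \int \tilde f \, d\nu_{K_i}
        = \frac{\mathcal{E}(f, e_{K_i})}{\operatorname{Cap}(K_i)}, \\
% Renormalized ridge objective
\hat f &\in \arg\min_{f \in \mathcal{F}}
        \frac{1}{n} \sum_{i=1}^{n} \bigl( y_i - M_i f \bigr)^2
        + \lambda \, \mathcal{E}(f, f), \\
% Representer form and one solution of the first-order conditions
\hat f &= \sum_{i=1}^{n} \alpha_i \, e_{K_i}, \qquad
\alpha = \bigl( A + n \lambda D^2 \bigr)^{-1} D \, y, \\
A_{ij} &= \mathcal{E}(e_{K_i}, e_{K_j}), \qquad
D = \operatorname{diag}\bigl( \operatorname{Cap}(K_1), \dots, \operatorname{Cap}(K_n) \bigr).
\end{align*}
```

Under these assumptions, each $M_i$ is a bounded linear functional in the energy norm, since $\operatorname{Cap}(K_i) = \mathcal{E}(e_{K_i}, e_{K_i})$ and the Cauchy-Schwarz inequality give $|M_i f| \le \mathcal{E}(f, f)^{1/2} / \operatorname{Cap}(K_i)^{1/2}$. This is what makes the penalized objective well-posed and places a minimizer in the span of the $e_{K_i}$. If $K_i$ is taken to be a superlevel set $\{ G(x_i, \cdot) \ge t_i \}$ of the Green function, then under suitable regularity $e_{K_i} = \min\bigl( G(x_i, \cdot) / t_i, \, 1 \bigr)$ and $\operatorname{Cap}(K_i) = 1 / t_i$, which is consistent with the abstract's description of equilibrium potentials as truncated Green functions.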
