Differentiable Neural Networks with RePU Activation: with Applications to Score Estimation and Isotonic Regression
Guohao Shen, Yuling Jiao, Yuanyuan Lin, Jian Huang
TL;DR
This work develops a rigorous theory for differentiable neural networks with RePU activations, showing that network derivatives admit efficient Mixed RePU representations and establishing complexity and approximation bounds that jointly address functions and their derivatives. It introduces deep score estimation (DSME) and penalized deep isotonic regression (PDIR), providing non-asymptotic excess risk bounds and minimax-optimal rates under $C^s$-smooth targets, with robustness to misspecification and improvements when the data lie near low-dimensional manifolds. A central advance is the simultaneous approximation of $C^s$ functions and their derivatives, enabled by explicit RePU architectures and polynomial-representation techniques, along with a manifold-aware analysis that mitigates the curse of dimensionality. The results have practical implications for high-dimensional derivative-based estimation tasks, diffusion-based generative modeling, and shape-constrained regression, offering theoretically grounded, scalable tools for score estimation and isotonic regression in complex settings.
Abstract
We study the properties of differentiable neural networks activated by rectified power unit (RePU) functions. We show that the partial derivatives of RePU neural networks can be represented by networks with mixed RePU activations and derive upper bounds for the complexity of the function class of derivatives of RePU networks. We establish error bounds for simultaneously approximating $C^s$ smooth functions and their derivatives using RePU-activated deep neural networks. Furthermore, we derive improved approximation error bounds when the data has an approximate low-dimensional support, demonstrating the ability of RePU networks to mitigate the curse of dimensionality. To illustrate the usefulness of our results, we consider a deep score matching estimator (DSME) and propose a penalized deep isotonic regression (PDIR) using RePU networks. We establish non-asymptotic excess risk bounds for DSME and PDIR under the assumption that the target functions belong to a class of $C^s$ smooth functions. We also show that PDIR achieves the minimax optimal convergence rate and is robust in the sense that it remains consistent with vanishing penalty parameters even when the monotonicity assumption is not satisfied. Furthermore, if the data distribution is supported on an approximate low-dimensional manifold, we show that DSME and PDIR can mitigate the curse of dimensionality.
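As a small illustration of why derivatives of RePU networks stay within a mixed RePU family, the sketch below assumes the standard definition of the order-p rectified power unit, max(0, x)^p, and a single scalar neuron. The function names (`repu`, `repu_grad`, `neuron`, `neuron_grad`, `finite_diff`) are hypothetical and chosen only for this example; it is not the construction used in the paper's proofs.

```python
def repu(x: float, p: int) -> float:
    """Order-p rectified power unit, assumed to be max(0, x) ** p (p = 1 is ReLU)."""
    if p < 1:
        raise ValueError("p must be a positive integer")
    return max(0.0, x) ** p


def repu_grad(x: float, p: int) -> float:
    """Derivative of the order-p RePU for p >= 2: p times the order-(p - 1) RePU."""
    if p < 2:
        raise ValueError("p must be at least 2 for a continuous derivative")
    return p * repu(x, p - 1)


def neuron(x: float, w: float, b: float, p: int) -> float:
    """A single RePU neuron with scalar input: repu(w * x + b, p)."""
    return repu(w * x + b, p)


def neuron_grad(x: float, w: float, b: float, p: int) -> float:
    """Chain rule: the derivative is again a RePU unit, now of order p - 1."""
    return w * repu_grad(w * x + b, p)


def finite_diff(f, x: float, h: float = 1e-6) -> float:
    """Central finite difference, used only as a numerical sanity check."""
    return (f(x + h) - f(x - h)) / (2.0 * h)


if __name__ == "__main__":
    w, b, p = 2.0, -0.3, 3
    exact = neuron_grad(0.5, w, b, p)
    approx = finite_diff(lambda t: neuron(t, w, b, p), 0.5)
    print(exact, approx)
```

In this sketch the derivative of an order-3 neuron is expressed with an order-2 unit, so a network that computes both a function and its gradient mixes activations of different orders.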
