Semi-Supervised Deep Sobolev Regression: Estimation and Variable Selection by ReQU Neural Network
Zhao Ding, Chenguang Duan, Yuling Jiao, Jerry Zhijian Yang
TL;DR
This paper introduces SDORE, a semi-supervised deep Sobolev regressor based on ReQU networks that penalizes the gradient norm in order to jointly estimate a regression function and its gradient. Because the Sobolev penalty does not involve the responses, it can be approximated with unlabeled data; the resulting estimator achieves the minimax-optimal $L^{2}$ convergence rate for the function and comes with convergence guarantees for the plug-in gradient estimator, even under domain shift. The authors establish oracle inequalities and convergence rates for both the supervised estimator DORE and its semi-supervised variant SDORE, which show a provable advantage of unlabeled data and give guidance on choosing the regularization parameter and the network size. The framework is extended to nonparametric variable selection via derivative-based sparsity, with theoretical guarantees and extensive numerical validation. Overall, the work advances the theory of neural network-based Sobolev regression, clarifies when unlabeled data helps, and demonstrates practical benefits in high-dimensional, nonparametric settings.
Abstract
We propose SDORE, a Semi-supervised Deep Sobolev Regressor, for the nonparametric estimation of the underlying regression function and its gradient. SDORE employs deep ReQU neural networks to minimize the empirical risk with gradient norm regularization, which allows the regularization term to be approximated using unlabeled data. We provide a thorough analysis of the convergence rates of SDORE in the $L^{2}$-norm and show that they are minimax optimal. Further, we establish a convergence rate for the associated plug-in gradient estimator, even in the presence of significant domain shift. These theoretical findings offer guidance for selecting the regularization parameter and determining the size of the neural network, and they demonstrate the provable advantage of leveraging unlabeled data in semi-supervised learning. To the best of our knowledge, SDORE is the first neural network-based approach with provable guarantees that simultaneously estimates the regression function and its gradient, with applications such as nonparametric variable selection. The effectiveness of SDORE is validated through an extensive range of numerical simulations.
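To make the structure of the objective concrete, the following is a minimal sketch, not the authors' implementation. It assumes the objective has the form $\frac{1}{n}\sum_{i=1}^{n}(f(X_i)-Y_i)^2 + \lambda\,\frac{1}{m}\sum_{j=1}^{m}\|\nabla f(Z_j)\|_2^2$, where $(X_i, Y_i)$ are labeled pairs and $Z_j$ are unlabeled covariates; this notation, the one-hidden-layer architecture, and all function names (`init_params`, `f_and_grad`, `sdore_objective`) are illustrative assumptions. The ReQU activation $\sigma(t)=\max(0,t)^2$ is continuously differentiable, so the input gradient of the network is well defined everywhere and can be written in closed form.

```python
import random


def requ(t):
    """ReQU activation: max(0, t) squared."""
    return max(0.0, t) ** 2


def requ_prime(t):
    """Derivative of ReQU: 2 * max(0, t), continuous at 0."""
    return 2.0 * max(0.0, t)


def init_params(d, width, seed=0):
    """Random parameters of a one-hidden-layer ReQU network (illustrative only)."""
    rng = random.Random(seed)
    W = [[rng.gauss(0.0, 1.0) for _ in range(d)] for _ in range(width)]
    b = [rng.gauss(0.0, 1.0) for _ in range(width)]
    a = [rng.gauss(0.0, 1.0) / width for _ in range(width)]
    return W, b, a


def f_and_grad(params, x):
    """Return f(x) and its gradient with respect to the input x."""
    W, b, a = params
    # Pre-activations of the hidden layer.
    pre = [sum(wk * xk for wk, xk in zip(w, x)) + bk for w, bk in zip(W, b)]
    value = sum(ak * requ(p) for ak, p in zip(a, pre))
    # Chain rule: d f / d x_k = sum_j a_j * requ'(pre_j) * W[j][k].
    grad = [
        sum(ak * requ_prime(p) * w[k] for ak, p, w in zip(a, pre, W))
        for k in range(len(x))
    ]
    return value, grad


def sdore_objective(params, labeled, unlabeled, lam):
    """Least-squares fit on labeled data plus a gradient-norm penalty
    averaged over unlabeled covariates (assumed form of the objective)."""
    fit = sum((f_and_grad(params, x)[0] - y) ** 2 for x, y in labeled) / len(labeled)
    penalty = sum(
        sum(g * g for g in f_and_grad(params, z)[1]) for z in unlabeled
    ) / len(unlabeled)
    return fit + lam * penalty
```

In this sketch the penalty term never touches the responses, which is why additional unlabeled covariates can be used to approximate it; in practice one would minimize the objective over a deep ReQU network with an automatic differentiation library rather than evaluate it with hand-written derivatives.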
