
HAL-MLE Log-Splines Density Estimation (Part I: Univariate)

Yilong Hou, Zhengpu Zhao, Yi Li, Mark van der Laan

TL;DR

The paper develops HAL-MLE with a log-spline link for univariate density estimation under a bounded sectional variation norm penalty, linking HAL to classical TV-penalized approaches. It proves that the univariate HAL-MLE is asymptotically linear, is pointwise asymptotically normal, and converges uniformly at rate $n^{-(k+1)/(2k+3)}$ (up to log factors, for $k\ge1$), with an extension to higher dimensions planned for follow-up work. It situates HAL-MLE within NPMLE, provides a delta-method variance estimator for the density parameter, and demonstrates asymptotic efficiency for pathwise-differentiable estimands via plug-in HAL-MLE and HAL-TMLE. The work includes optimization strategies and simulations comparing HAL-MLE to logspline, trend filtering, and KDE, plus a galaxy velocity case study illustrating practical inference and variance estimation. Overall, the framework offers a unified, theoretically grounded approach to TV-penalized density estimation, with inference guarantees for nonparametric densities and related estimands.
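As a quick arithmetic check of the stated rate (only the exponent formula comes from the summary above; the evaluations are plain algebra), the exponent $(k+1)/(2k+3)$ works out as follows for small smoothness orders:

$$
\left.\frac{k+1}{2k+3}\right|_{k=1}=\frac{2}{5},\qquad
\left.\frac{k+1}{2k+3}\right|_{k=2}=\frac{3}{7},\qquad
\left.\frac{k+1}{2k+3}\right|_{k=3}=\frac{4}{9},\qquad
\frac{k+1}{2k+3}=\frac{1}{2}-\frac{1}{2(2k+3)}\;\longrightarrow\;\frac{1}{2}\quad (k\to\infty).
$$

Higher smoothness order therefore gives a sup-norm rate that approaches, but never reaches, the parametric rate $n^{-1/2}$.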

Abstract

We study nonparametric maximum likelihood estimation of probability densities under a total variation (TV) type penalty, the sectional variation norm (also known as Hardy-Krause variation). TV regularization has a long history in regression and density estimation, including results on $L^2$ and KL divergence convergence rates. Here, we revisit this task using the Highly Adaptive Lasso (HAL) framework. We formulate a HAL-based maximum likelihood estimator (HAL-MLE) using the log-spline link function from \citet{kooperberg1992logspline}, and show that in the univariate setting the bounded sectional variation norm assumption underlying HAL coincides with the classical bounded TV assumption. This equivalence directly connects HAL-MLE to existing TV-penalized approaches such as locally adaptive splines \citep{mammen1997locally}. We establish three new theoretical results: (i) the univariate HAL-MLE is asymptotically linear, (ii) it admits pointwise asymptotic normality, and (iii) it achieves uniform convergence at rate $n^{-(k+1)/(2k+3)}$, up to logarithmic factors, for smoothness order $k \geq 1$. These results extend those of \citet{van2017uniform}, which previously guaranteed only uniform consistency without rates when $k=0$. Uniform convergence for general dimension $d$ is deferred to follow-up work. The intention of this paper is to provide a unified framework for TV-penalized density estimation methods and to connect HAL-MLE to the existing TV-penalized methods in the univariate case, even though the general HAL-MLE is defined for multivariate settings.
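To make the log-spline link and the TV-type penalty concrete, here is a minimal sketch under stated assumptions: a zero-order ($k=0$) indicator basis on $[0,1]$, knots sorted strictly inside $(0,1)$, and an $L^1$ penalty on the coefficients standing in for the variation norm. The function names (`hal_basis`, `log_normalizer`, `density`, `penalized_nll`) and the penalty weight `lam` are hypothetical choices for illustration and are not taken from the paper's implementation.

```python
import numpy as np

def hal_basis(x, knots):
    """Zero-order indicator basis: phi_j(x) = 1{x >= u_j}, one column per knot."""
    x = np.asarray(x, dtype=float)
    return (x[:, None] >= knots[None, :]).astype(float)

def log_normalizer(beta, knots):
    """log of the integral of exp(f) over [0, 1].

    Exact here because f = sum_j beta_j * phi_j is piecewise constant
    between consecutive (sorted) knots.
    """
    edges = np.concatenate(([0.0], knots, [1.0]))
    widths = np.diff(edges)                            # interval lengths
    f_vals = np.concatenate(([0.0], np.cumsum(beta)))  # value of f on each interval
    return np.log(np.sum(widths * np.exp(f_vals)))

def density(x, beta, knots):
    """Log-spline link: p(x) = exp(f(x)) / integral of exp(f)."""
    f = hal_basis(x, knots) @ beta
    return np.exp(f - log_normalizer(beta, knots))

def penalized_nll(beta, x, knots, lam):
    """Average negative log-likelihood plus lam * ||beta||_1.

    For this indicator basis, ||beta||_1 equals the total variation of f.
    """
    f = hal_basis(x, knots) @ beta
    nll = -(np.mean(f) - log_normalizer(beta, knots))
    return nll + lam * np.sum(np.abs(beta))
```

With `beta` set to zero the sketch returns the uniform density on $[0,1]$, and any other `beta` yields a piecewise-constant density that integrates to one. Minimizing `penalized_nll` over `beta` (for example with a generic convex solver) would give a penalized-likelihood fit in the spirit of the estimator described above; the paper's own optimization strategies are not reproduced here.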

Paper Structure

This paper contains 103 sections, 12 theorems, 140 equations, 37 figures, and 8 tables.

Key Result

Proposition 2.1

Any $f\in D^{(k)}_U([0,1])$ can be represented as follows:

Figures (37)

  • Figure 1: Optimization algorithm comparison for Truncated Normal (2nd order basis): knot selection per iteration (top left), knot selection per FLOP (top right), loss convergence per iteration (bottom left), and loss convergence per FLOP (bottom right).
  • Figure 2: Six DGPs: (a) TN, (b) GS3, (c) GA3, (d) GS5, (e) Step, (f) Sine.
  • Figure 3: Uniform-convergence sup-norm error (log scale) by DGP. Panels (row-wise, left to right): (a) TN, (b) GS3, (c) GA3, (d) GS5, (e) Step, (f) Sine.
  • Figure 4: Representative asymptotic-efficiency comparison (DGP: GA3). Columns show four statistical estimands (Mean, Median, Survival at 0.5, Second Moment) and rows show three metrics (MSE, Variance, $|$Bias$|$/SE). Curves compare the asymptotically efficient estimator (green), HAL-MLE (blue), and HAL-TMLE (red).
  • Figure 5: Bias at $n=800$ across six DGPs (columns) for five estimators (legend).
  • ...and 32 more figures

Theorems & Definitions (37)

  • Definition 2.1: Càdlàg Functions with Bounded Sectional Variational Norm (BSVN)
  • Definition 2.2: The Function with k-th order BSVN
  • Proposition 2.1: Univariate Exact Representation
  • Remark 2.1
  • Remark 3.1
  • Remark 4.1
  • Remark 4.2
  • Theorem 4.1: $L^2$ convergence of HAL-MLE
  • Theorem 4.2: Asymptotic Linearity of HAL-MLE
  • Remark 4.3: The Necessity for Cross-Validation and Undersmoothing
  • ...and 27 more