Exponential Family Trend Filtering on Lattices

Veeranjaneyulu Sadhanala, Robert Bassett, James Sharpnack, Daniel J. McDonald

TL;DR

The paper develops trend-filtering methods for exponential-family data on lattice graphs, motivated by massive climate datasets. It introduces Penalized MLE with a null-space penalty and Mean Trend Filter, analyzes their excess KL-risk under subexponential, heteroskedastic noise, and derives near-minimax rates under canonical scaling. An efficient linearized ADMM algorithm is provided, along with a risk-based tuning parameter selection framework. Empirical studies on simulations and real data (UC COVID-19 hospitalizations and temperature variability) demonstrate the methods’ practical utility and adaptive smoothing capabilities in high-dimensional grid settings.

Abstract

Trend filtering is a modern approach to nonparametric regression that is more adaptive to local smoothness than splines or similar basis procedures. Existing analyses of trend filtering focus on estimating a function corrupted by homoskedastic Gaussian noise, but our work extends this technique to general exponential family distributions. This extension is motivated by the need to study massive, gridded climate data derived from polar-orbiting satellites. We present algorithms tailored to large problems, theoretical results for general exponential family likelihoods, and principled methods for tuning parameter selection without excess computation.

Paper Structure (52 sections, 36 theorems, 232 equations, 6 figures, 4 tables, 1 algorithm)

Key Result

Lemma 1

A basis for the null space of $D$ is given by a family of polynomials in the coordinates $x_1, \dots, x_d$, where $x_j$ denotes the coordinates of the observations along the $j$th dimension. The dimension of the null space is $\mathrm{nullity}(D) = \prod_{j=1}^d (k_j+1)$.
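
The dimension count in Lemma 1 can be checked numerically on a small grid. The sketch below is illustrative only and rests on an assumption not stated in this summary: that $D$ stacks, for each dimension $j$, the $(k_j+1)$-th order discrete difference operator applied along that dimension (a Kronecker-product construction). Under that assumption, the monomials $\prod_j x_j^{a_j}$ with $0 \le a_j \le k_j$ are one family consistent with the stated nullity. The helper names `diff_matrix`, `lattice_penalty`, and `nullity` are hypothetical.

```python
import numpy as np

def diff_matrix(n, order):
    """Dense (n - order) x n matrix of order-th forward differences."""
    D = np.eye(n)
    for _ in range(order):
        D = np.diff(D, axis=0)  # each pass raises the difference order by one
    return D

def lattice_penalty(shape, k):
    """ASSUMED form of D: for each axis j, the (k_j + 1)-th difference
    along axis j (identity on the other axes), stacked vertically."""
    blocks = []
    for j, (n_j, k_j) in enumerate(zip(shape, k)):
        factors = [np.eye(n) for n in shape]
        factors[j] = diff_matrix(n_j, k_j + 1)
        block = factors[0]
        for f in factors[1:]:
            block = np.kron(block, f)  # matches row-major flattening of the grid
        blocks.append(block)
    return np.vstack(blocks)

def nullity(D):
    """Number of columns minus numerical rank."""
    return D.shape[1] - np.linalg.matrix_rank(D)

shape, k = (6, 7), (1, 2)  # a 6 x 7 grid with orders k_1 = 1, k_2 = 2
D = lattice_penalty(shape, k)
expected = int(np.prod([k_j + 1 for k_j in k]))
print(nullity(D), expected)

# Monomials x1^a * x2^b with a <= k_1, b <= k_2 should be annihilated by D.
x1, x2 = np.meshgrid(np.arange(shape[0]), np.arange(shape[1]), indexing="ij")
max_residual = max(
    np.abs(D @ ((x1 ** a) * (x2 ** b)).ravel().astype(float)).max()
    for a in range(k[0] + 1)
    for b in range(k[1] + 1)
)
```

Dense matrices keep the check short; for grids of realistic size a sparse construction would be needed.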

Figures (6)

  • Figure 1: Estimates of the instantaneous temperature variance for 1 January 2010 over Canada. The top row shows the absolute centered data, the 0th-order trend filter, and the 2nd-order trend filter, the latter two with reasonable values of the tuning parameter. The bottom row shows the 1st-order trend filter for different tuning parameters; the leftmost map, labeled "optimal", is the estimate whose degrees of freedom are chosen by minimizing an unbiased risk estimate.
  • Figure 2: Estimation accuracy for both types of trend filters. The left column (panel A) compares the estimators when the mean is smooth. The right column (panel B) compares the estimators when the natural parameter is smooth. Solid lines show the average error across replications while the points show the error for each replication.
  • Figure 3: Estimates from both trend filters for the 4 scenarios when $n=104$.
  • Figure 4: Estimated daily hospitalization rate due to COVID-19 by 5-year age group and week in five UC hospitals. We apply $k=1$ trend filtering with the Poisson exponential family (left) to the raw count data (right).
  • Figure 5: Panel A shows the change in average temperatures observed in the northern hemisphere from the 1960s relative to the 2000s in degrees Celsius. Panel B shows the change in estimated standard deviation (using the KL trend filter with $k=1$ in the temporal dimension and $k=2$ over space) from the 1960s relative to the 2000s. Standard deviations were estimated at each spatio-temporal grid location before being averaged separately over winter/summer months over the appropriate decade.
  • ...and 1 more figure

Theorems & Definitions (68)

  • Lemma 1
  • Lemma 2
  • Remark 1: Degenerate Poisson example
  • Proposition 1
  • Theorem 1
  • Corollary 1.1
  • Corollary 1.2
  • Corollary 1.3
  • Theorem 2
  • Theorem 3
  • ...and 58 more