Optimal smoothing parameter in Eilers-Whittaker smoother
Roberto Bernal-Arencibia, Karel Garcia Medina, Ernesto Estevez-Rams, Beatriz Aragon-Fernandez
TL;DR
The paper addresses automatic selection of the regularization parameter $\lambda$ in the Eilers-Whittaker smoothing framework, where standard selection methods can fail under serially correlated noise. It introduces a criterion based on spectral entropy: $H_S = -\sum_q F(q) \log F(q)$ is computed from the Fourier power spectrum $F(q)$ and combined with the residual-derived entropy $H_{\hat{y}}$ to form a two-dimensional descriptor $h_\lambda = (\log H_S, \log H_{\hat{y}})$. The Euclidean distance between successive descriptors, $e_\lambda = \| h_{\lambda+1} - h_\lambda \|$ (with $\lambda+1$ denoting the next value on the grid of $\lambda$), yields an S-curve whose absolute maximum defines the optimal parameter $\lambda_o$. In simulations across varying noise levels, the spectral-entropy method selects values of $\lambda$ closer to the optimum that minimizes the mean squared error than leave-one-out cross-validation or the V-curve do. Validation on real-world data from finance, astronomy, and chemistry indicates that the method is robust, producing smoothed curves that balance noise reduction against feature preservation. Overall, the method is a simple, unsupervised addition to the toolkit for selecting the smoothing parameter in large datasets.
Abstract
The effectiveness of the Eilers-Whittaker method for data smoothing depends on the choice of the regularisation parameter, and automatic selection is a necessity for large datasets. Common methods, such as leave-one-out cross-validation, can perform poorly when serially correlated noise is present. We propose a novel procedure for selecting the control parameter based on the spectral entropy of the residuals. We define an S-curve from the Euclidean distance between points in a plot of the spectral entropy of the residuals versus that of the smoothed signal. The regularisation parameter corresponding to the absolute maximum of this S-curve is chosen as the optimal parameter. Using simulated data, we benchmarked our method against cross-validation and the V-curve. Validation was also performed on diverse experimental data. This robust and straightforward procedure can be a valuable addition to the available selection methods for the Eilers-Whittaker smoother.
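As a rough illustration of the procedure described in the abstract, the following Python sketch smooths a signal over a grid of $\lambda$ values, computes the spectral entropy of the residuals and of the smoothed signal, and picks the $\lambda$ at the absolute maximum of the resulting S-curve. It is a minimal sketch under assumptions not stated in the text: the function names (`whittaker_smooth`, `spectral_entropy`, `select_lambda`), the second-order difference penalty, the dense linear solve, the removal of the mean before the FFT, the normalisation of the power spectrum to unit sum, and the assignment of each distance to the smaller of the two neighbouring $\lambda$ values are illustrative choices, not the authors' implementation.

```python
import numpy as np


def whittaker_smooth(y, lam, d=2):
    """Whittaker smoother: solve (I + lam * D^T D) z = y.

    D is the d-th order difference matrix (d=2 is an assumed default).
    A dense solve is used for brevity; it is only practical for short series.
    """
    y = np.asarray(y, dtype=float)
    n = y.size
    D = np.diff(np.eye(n), n=d, axis=0)  # shape (n - d, n)
    return np.linalg.solve(np.eye(n) + lam * (D.T @ D), y)


def spectral_entropy(x):
    """Shannon entropy of the normalised Fourier power spectrum of x."""
    x = np.asarray(x, dtype=float)
    # Assumption: remove the mean so the zero-frequency bin does not dominate.
    power = np.abs(np.fft.rfft(x - x.mean())) ** 2
    total = power.sum()
    if total == 0.0:
        return 0.0  # constant input: no spectral content
    p = power / total
    p = p[p > 0]  # treat 0 * log(0) as 0
    return float(-(p * np.log(p)).sum())


def select_lambda(y, lambdas, d=2):
    """Return (lambda_o, e), where e is the S-curve over the lambda grid.

    Each grid value gives a descriptor (log H_residual, log H_smooth);
    e[i] is the Euclidean distance between descriptors i and i + 1.
    """
    y = np.asarray(y, dtype=float)
    lambdas = np.asarray(lambdas, dtype=float)
    h = []
    for lam in lambdas:
        z = whittaker_smooth(y, lam, d)
        h.append((np.log(spectral_entropy(y - z)),
                  np.log(spectral_entropy(z))))
    h = np.array(h)
    e = np.linalg.norm(np.diff(h, axis=0), axis=1)
    # Assumption: attribute e[i] to lambdas[i], the smaller of the pair.
    return float(lambdas[np.argmax(e)]), e
```

For example, `select_lambda(y, np.logspace(-2, 6, 50))` returns the selected grid value together with the S-curve for a one-dimensional array `y`. For long series, a sparse banded solver would be preferable to the dense solve used here.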
