An entropy-based approach for a robust least squares spline approximation
Luigi Brugnano, Domenico Giordano, Felice Iavernaro, Giorgia Rubino
TL;DR
This work introduces a maximum-entropy weighted least squares (MEWLS) framework for robust spline approximation, in which the data-point weights form a probability distribution and are chosen to maximize entropy subject to a prescribed weighted mean squared error $\overline{E^2}$. The method leads to a nonlinear but deterministic optimization problem that downweights outliers through weights $w_i\propto\exp(-\lambda_2\,\|f(t_i,c)-y_i\|^2)$ and moves smoothly from ordinary least squares to MEWLS via a continuation on $\overline{E^2}$. A hybrid iterative solver couples the spline coefficients, the weights, and the Lagrange multiplier, enabling automatic outlier detection and scoring. Numerical experiments on synthetic curves and real data (HR diagrams, rail-track detection, and environmental O$_3$ series) demonstrate the improved robustness of MEWLS and its potential as a preprocessing tool in data-intensive pipelines.
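The exponential form of the weights can be recovered with a short Lagrange-multiplier argument. The following is a sketch under assumptions not stated in the summary: $m$ denotes the number of data points, $\lambda_1$ is a multiplier for the normalization constraint (named by analogy with $\lambda_2$), and the sign convention of the Lagrangian is chosen so that $\lambda_2$ appears as in the weight formula above.

$$
\begin{aligned}
&\max_{w,\,c}\; -\sum_{i=1}^{m} w_i \log w_i
\quad \text{subject to} \quad
\sum_{i=1}^{m} w_i = 1, \qquad
\sum_{i=1}^{m} w_i\,\|f(t_i,c)-y_i\|^2 = \overline{E^2}, \\
&\mathcal{L} = -\sum_{i=1}^{m} w_i \log w_i
 - \lambda_1\Big(\sum_{i=1}^{m} w_i - 1\Big)
 - \lambda_2\Big(\sum_{i=1}^{m} w_i\,\|f(t_i,c)-y_i\|^2 - \overline{E^2}\Big), \\
&\frac{\partial \mathcal{L}}{\partial w_i}
 = -\log w_i - 1 - \lambda_1 - \lambda_2\,\|f(t_i,c)-y_i\|^2 = 0
\;\;\Longrightarrow\;\;
w_i = \frac{\exp\!\big(-\lambda_2\,\|f(t_i,c)-y_i\|^2\big)}
           {\sum_{j=1}^{m}\exp\!\big(-\lambda_2\,\|f(t_j,c)-y_j\|^2\big)}.
\end{aligned}
$$

With $\lambda_2 = 0$ the weights are uniform, $w_i = 1/m$, which corresponds to ordinary least squares; lowering $\overline{E^2}$ below the ordinary least squares error forces $\lambda_2 > 0$, so points with large residuals receive exponentially small weights.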
Abstract
We consider the weighted least squares spline approximation of a noisy dataset. By interpreting the weights as a probability distribution, we maximize the associated entropy subject to the constraint that the mean squared error takes a prescribed (small) value. Acting on this error yields a robust regression method that automatically detects outliers and effectively removes them from the data during the fitting procedure by assigning them a very small weight. We discuss the use of both spline functions and spline curves. A number of numerical illustrations are included to show the potential of the maximum-entropy approach in different application fields.
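To make the idea concrete, here is a minimal Python sketch of an entropy-weighted spline fit for scalar data. It is not the authors' hybrid solver: it simply alternates between a bisection for the multiplier and a weighted least squares solve while the target error is lowered geometrically from the ordinary least squares value. The function names, the truncated-power cubic basis, and the parameters `n_stages` and `n_inner` are all assumptions made for illustration.

```python
import numpy as np

def cubic_spline_basis(t, knots):
    """Truncated-power cubic spline basis: 1, t, t^2, t^3, (t - k)_+^3."""
    t = np.asarray(t, dtype=float)
    poly = np.vander(t, 4, increasing=True)
    trunc = np.clip(t[:, None] - np.asarray(knots)[None, :], 0.0, None) ** 3
    return np.hstack([poly, trunc])

def entropy_weights(r2, lam):
    """Weights w_i proportional to exp(-lam * r_i^2), normalized to sum to 1."""
    w = np.exp(-lam * (r2 - r2.min()))  # shift exponent for numerical stability
    return w / w.sum()

def solve_multiplier(r2, target, iters=100):
    """Bisection for lam >= 0 such that sum_i w_i(lam) * r_i^2 = target."""
    if target >= r2.mean():
        return 0.0  # uniform weights already meet the target
    lo, hi = 0.0, 1.0
    # The weighted error decreases monotonically in lam; grow hi to bracket.
    while entropy_weights(r2, hi) @ r2 > target and hi < 1e12:
        hi *= 2.0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if entropy_weights(r2, mid) @ r2 > target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def weighted_lsq(B, y, w):
    """Minimize sum_i w_i * (B c - y)_i^2 over the coefficients c."""
    s = np.sqrt(w)
    c, *_ = np.linalg.lstsq(B * s[:, None], y * s, rcond=None)
    return c

def mewls_fit(t, y, knots, target_mse, n_stages=30, n_inner=5):
    """Alternating sketch: continuation on the prescribed weighted MSE."""
    B = cubic_spline_basis(t, knots)
    w = np.full(len(y), 1.0 / len(y))
    c = weighted_lsq(B, y, w)              # ordinary least squares start
    mse0 = np.mean((B @ c - y) ** 2)
    lam = 0.0
    for E in np.geomspace(mse0, target_mse, n_stages + 1)[1:]:
        for _ in range(n_inner):
            r2 = (B @ c - y) ** 2
            lam = solve_multiplier(r2, E)  # enforce the error constraint
            w = entropy_weights(r2, lam)   # maximum-entropy weights
            c = weighted_lsq(B, y, w)      # refit the spline coefficients
    return c, w, lam
```

A call such as `c, w, lam = mewls_fit(t, y, knots, target_mse=1e-3)` returns the coefficients, the final weights, and the multiplier; points whose weight is far below the median weight are natural outlier candidates. The target error should be chosen close to the expected noise level of the clean data.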
