Maximum Entropy Least Squares Solutions of Overdetermined Linear Systems

Felice Iavernaro, Monica Lazzo, Lorenzo Pisani

Abstract

We investigate the theoretical foundations of a recently introduced entropy-based formulation of weighted least squares for the approximation of overdetermined linear systems, motivated by robust data fitting in the presence of sparse gross errors. The weight vector is interpreted as a discrete probability distribution and is determined by maximizing Shannon entropy under normalization and a prescribed mean squared error (MSE) constraint. Unlike classical ordinary least squares, where the error level is an output of the minimization process, here the MSE value plays the role of a control parameter, and entropy selects the least biased weight distribution achieving the prescribed accuracy. The resulting optimization problem is nonconvex due to the nonlinear coupling between the weights and the solution induced by the residual constraint. We analyze the associated optimality system and characterize stationary points through first- and second-order conditions. We prove the existence and local uniqueness of a smooth branch of entropy-maximizing configurations emanating from the ordinary least squares solution and establish its global continuation under suitable nondegeneracy conditions. Furthermore, we investigate the asymptotic regime as the prescribed MSE tends to zero and show that, under appropriate assumptions, the limiting configuration concentrates on a largest subset of data consistent with the linear model, thus suppressing the influence of outliers. Two numerical experiments illustrate the theoretical findings and confirm the robustness properties of the method.
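The abstract's formulation can be illustrated with a short numerical sketch. It rests on assumptions not confirmed by the text above: that the constraint reads $\sum_i w_i r_i(x)^2 = E$ with $r = Ax - b$, so that stationarity in $w$ gives $w_i \propto \exp(-\mu r_i^2)$ and stationarity in $x$ gives the weighted normal equations. For simplicity the sketch fixes the multiplier $\mu$ and reports the resulting MSE $E$, rather than prescribing $E$ as the paper does. The function name `mewls_fixed_mu`, the fixed-point iteration, the sign convention for $\mu$, and the toy data are all hypothetical and are not the authors' algorithm.

```python
import numpy as np

def mewls_fixed_mu(A, b, mu, iters=200, tol=1e-12):
    """Hypothetical fixed-point sketch for a fixed multiplier mu >= 0.

    Alternates between a weighted least squares solve for x and the
    entropy-type weight update w_i proportional to exp(-mu * r_i^2).
    Returns (x, w, E, H): solution, weights, weighted MSE, Shannon entropy.
    """
    m = A.shape[0]
    w = np.full(m, 1.0 / m)          # start from uniform weights (OLS)
    for _ in range(iters):
        sw = np.sqrt(w)
        # Weighted least squares: minimize sum_i w_i * (A x - b)_i^2
        x, *_ = np.linalg.lstsq(A * sw[:, None], b * sw, rcond=None)
        r2 = (A @ x - b) ** 2
        # Subtract the minimum before exponentiating to avoid underflow
        w_new = np.exp(-mu * (r2 - r2.min()))
        w_new /= w_new.sum()         # normalization constraint sum(w) = 1
        done = np.max(np.abs(w_new - w)) < tol
        w = w_new
        if done:
            break
    # Final solve so that x is consistent with the returned weights
    sw = np.sqrt(w)
    x, *_ = np.linalg.lstsq(A * sw[:, None], b * sw, rcond=None)
    E = float(w @ (A @ x - b) ** 2)  # weighted mean squared error
    pos = w > 0                      # convention: 0 * log(0) = 0
    H = float(-np.sum(w[pos] * np.log(w[pos])))
    return x, w, E, H

# Toy data (made up for illustration): 10 inliers on y = x/2, 3 shifted outliers
t = np.concatenate([np.arange(10.0), [2.0, 5.0, 8.0]])
y = 0.5 * t
y[10:] += 5.0
A = np.column_stack([t, np.ones_like(t)])   # columns: slope, intercept

x_ols, w_ols, E_ols, H_ols = mewls_fixed_mu(A, y, mu=0.0)
x_me, w_me, E_me, H_me = mewls_fixed_mu(A, y, mu=2.0)
print("OLS   slope/intercept:", x_ols, "MSE:", E_ols)
print("mu=2  slope/intercept:", x_me, "MSE:", E_me)
print("weights on outliers  :", w_me[10:])
```

With `mu=0.0` the weights stay uniform and the result is the ordinary least squares fit. To match the paper's viewpoint, in which the MSE is the control parameter, one could wrap this routine in a scalar root-finder that adjusts `mu` until the returned `E` hits a prescribed value; that step is omitted here.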

Paper Structure

This paper contains 12 sections, 12 theorems, 164 equations, 6 figures.

Key Result

Theorem 1

Assume that the vector $|r^*| = |Ax^*-b|$ is not constant. Then, there exist $\Delta > 0$ and unique functions $\lambda(\mathop{\mathrm{E}}\nolimits)$, $\mu(\mathop{\mathrm{E}}\nolimits)$, $w(\mathop{\mathrm{E}}\nolimits)$, and $x(\mathop{\mathrm{E}}\nolimits)$, defined on the interval $(\mathop{\ma

Figures (6)

  • Figure 1: Blue circles: inliers lying exactly on the line $y=\tfrac{1}{2} x$. Red asterisks: outliers. Yellow dashed line: OLS regression with uniform weights on the full dataset. Green solid line: MEWLS regression at the final MSE level, coinciding with the outlier-free solution. Dotted gray line: intermediate MEWLS configuration corresponding to $E=3\cdot 10^{-2}$.
  • Figure 2: Evolution of the twenty weights $w_i(E)$ as the MSE decreases (from right to left). The weights associated with the outliers decay to zero, while the remaining weights converge to the uniform distribution $1/10$, in agreement with Proposition \ref{prop:weights_on_S_uniform}.
  • Figure 3: Left: value function $\mathcal{V}(E)$, exhibiting strict concavity as predicted by Proposition \ref{prop:mu_monotone}. Right: Lagrange multiplier $\mu(E)$ plotted with a semilogarithmic scale on the horizontal axis, confirming positivity, strict monotonicity, and the asymptotic behavior described in Lemma \ref{lemma:mu-log}.
  • Figure 4: Left: smallest eigenvalue of the matrix $\widehat{S}$ along the solution branch, confirming the positivity property stated in Proposition \ref{prop:B_positive_on_V}. Right: noisy-inlier experiment. Yellow dashed line: OLS regression in the presence of outliers. Green solid line: MEWLS regression obtained after reducing the MSE to the outlier-free level, showing effective suppression of the outliers.
  • Figure 5: Left: initial uniform-weight regression line (dashed yellow) and MEWLS regression line (solid green) for the symmetric dataset. Because of symmetry with respect to $y=1/2$, the regression line remains unchanged along the branch as $E$ decreases. Right: evolution of the optimal weights along the solution branch. The weights associated with the points farthest from $y=1/2$ (red asterisks) decrease monotonically (red solid line), whereas those corresponding to the closer points (blue circles) increase symmetrically (blue solid line), illustrating the entropy redistribution process.
  • ...and 1 more figure

Theorems & Definitions (27)

  • Remark 1
  • Theorem 1
  • proof
  • Remark 2
  • Lemma 1
  • proof
  • Proposition 1: Monotonicity and positivity of $\mu(E)$
  • proof
  • Proposition 2
  • proof
  • ...and 17 more