Regularization for the Approximation of 2D Set of Points via the Length of the Curve

Majid E. Abbasov, Anna I. Belenok

TL;DR

The paper addresses fitting a curve to a 2D point set with a length-based regularization to mitigate spikes. It shows the minimizer lies in the piecewise-linear class and derives an explicit upper bound $\bar{\alpha} = (\max_i y_i - \min_i y_i)/\varepsilon$ that separates nontrivial fits from the trivial average-line solution. Theoretical results are supported by lemmas and a vector-angle formulation, and numerical experiments compare the proposed method to Ridge and Lasso, illustrating how increasing $\alpha$ drives the solution toward the average line. The work provides a principled, interpretable mechanism for regularizing curve-length in 2D point-set approximation with practical implications for data analysis.
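As a quick illustration of the bound quoted above, the sketch below evaluates $\bar{\alpha} = (\max_i y_i - \min_i y_i)/\varepsilon$ for a list of ordinates. The function name `alpha_upper_bound` and its argument names are hypothetical and not taken from the paper. The reading of `eps` as the prescribed positive tolerance on the distance from the average line is an assumption based on the abstract.

```python
from typing import Sequence


def alpha_upper_bound(y: Sequence[float], eps: float) -> float:
    """Evaluate (max_i y_i - min_i y_i) / eps, the bound quoted in the TL;DR.

    `y` holds the ordinates of the 2D points; `eps` is assumed to be the
    positive tolerance on the deviation from the average line.
    """
    if len(y) == 0:
        raise ValueError("y must contain at least one ordinate")
    if eps <= 0:
        raise ValueError("eps must be positive")
    return (max(y) - min(y)) / eps


if __name__ == "__main__":
    # Hypothetical ordinates with one spike; the values are for illustration only.
    ordinates = [1.0, 1.0, 10.0, 1.0, 1.0]
    print(alpha_upper_bound(ordinates, eps=0.5))
```

The bound grows with the spread of the ordinates and shrinks as the tolerance is relaxed, so a looser `eps` yields a smaller value.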

Abstract

We study the problem of approximating a 2D set of points. Problems of this type arise frequently in physical experiments, econometrics, data analysis, and other areas. Outliers and spikes often lead researchers to apply regularization techniques such as Lasso, Ridge, or Elastic Net. These approaches all employ a penalty coefficient, so the important question of estimating an upper bound for this coefficient arises. In the current study we propose a novel regularization method and derive an upper bound for its penalty coefficient. First, the problem is stated in a general form, with the solution sought in the class of piecewise continuously differentiable functions. It is shown that the optimal solution belongs to the class of piecewise linear functions. We therefore state the problem of finding the piecewise linear approximation that best fits a 2D set of points. We show that the optimal solution becomes trivial, tending to a line, as the penalty coefficient tends to infinity. Then the main result is stated and proved: it provides an upper bound for the penalty coefficient below which the optimal solution differs from the line by more than some prescribed positive number. We also demonstrate the proposed ideas on numerical examples, including a comparison with other regularization approaches.
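The limiting behaviour described in the abstract can be explored numerically. The sketch below is not the authors' formulation: it assumes a functional of the form $\sum_i (a_i - y_i)^2 + \alpha \cdot (\text{length of the broken line through } (x_i, a_i))$, with fixed, strictly increasing abscissas $x_i$, and minimizes it by plain gradient descent. The names `objective` and `fit_broken_line`, the squared-error fit term, and the solver are all illustrative assumptions.

```python
import math


def objective(a, x, y, alpha):
    """Assumed functional: squared residuals plus alpha times the polyline length."""
    fit = sum((ai - yi) ** 2 for ai, yi in zip(a, y))
    length = sum(
        math.hypot(x[i + 1] - x[i], a[i + 1] - a[i]) for i in range(len(x) - 1)
    )
    return fit + alpha * length


def fit_broken_line(x, y, alpha, iters=5000):
    """Gradient descent on the vertex ordinates a_i; the abscissas x_i stay fixed.

    Assumes x is strictly increasing. The step is 1/L, where L = 2 + 4*alpha/h
    bounds the Lipschitz constant of the gradient (h is the smallest x-spacing).
    """
    n = len(x)
    h = min(x[i + 1] - x[i] for i in range(n - 1))
    step = 1.0 / (2.0 + 4.0 * alpha / h)
    a = list(y)  # start from the data themselves
    for _ in range(iters):
        # Gradient of the fit term.
        grad = [2.0 * (a[i] - y[i]) for i in range(n)]
        # Gradient of alpha * sqrt(dx^2 + d^2) for each link, with d = a[i+1] - a[i].
        for i in range(n - 1):
            d = a[i + 1] - a[i]
            g = alpha * d / math.hypot(x[i + 1] - x[i], d)
            grad[i + 1] += g
            grad[i] -= g
        a = [ai - step * gi for ai, gi in zip(a, grad)]
    return a


# Hypothetical data with a single spike, for illustration only.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 1.0, 10.0, 1.0, 1.0]
for alpha in (0.01, 10.0, 1000.0):
    print(alpha, [round(v, 3) for v in fit_broken_line(xs, ys, alpha)])
```

Under these assumptions, a small `alpha` leaves the broken line close to the data, while a large `alpha` flattens it toward the horizontal line through the mean ordinate, which matches the qualitative behaviour the abstract reports.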

Paper Structure

This paper contains 5 sections, 3 theorems, 45 equations, 6 figures.

Key Result

Lemma 1

Let Then $\underset{a\in \Omega}{\operatorname{argmin}}\, J_{\alpha}(a)=\overline{a},$ where $\overline{a}=\left(\overline{a}_1,\dots,\overline{a}_n\right)$. In other words, on the class of lines, $J_{\alpha}(a)$ attains its minimum on a straight line parallel to the $Ox$ axis. This is the average line for the ordinates of all points of the set $X$.

Figures (6)

  • Figure 1: Illustration of the proof of Lemma 2
  • Figure 2: The angles that characterize the slopes of the links of the broken line
  • Figure 3: The optimal broken line (orange) and the average line (blue) in Example 1
  • Figure 4: The optimal broken line (orange) at $\overline{\alpha}$ values 500 (Fig. a), 750 (Fig. b), 1000 (Fig. c), 1250 (Fig. d) and the average line (blue) in Example 2
  • Figure 5: The optimal broken line constructed by the proposed modification (orange), Ridge regression (blue), and Lasso regression (green) at $\overline{\alpha}$ values $10$ (Fig. a), $25$ (Fig. b), $50$ (Fig. c), $250$ (Fig. d) in Example 3
  • ...and 1 more figures

Theorems & Definitions (10)

  • Lemma 1
  • proof
  • Lemma 2
  • proof
  • Theorem 1
  • proof
  • Example 1
  • Example 2
  • Example 3
  • Example 4