
Convex-area-wise Linear Regression and Algorithms for Data Analysis

Bohan Lyu, Jianzhong Li

TL;DR

The paper introduces a mixed-integer programming formulation equivalent to CALR, which can be solved approximately with existing optimization solvers.

Abstract

This paper introduces a new regression methodology named Convex-Area-Wise Linear Regression (CALR), which partitions a given dataset into disjoint convex areas and fits a separate linear regression model to each area. The resulting model is highly interpretable and can interpolate any given dataset, even when the underlying relationship between the explanatory and response variables is non-linear and discontinuous. To solve the CALR problem, three exact algorithms are proposed under different assumptions. Analyses of the correctness and time complexity of the algorithms are given, indicating that the problem can be solved exactly in $o(n^2)$ time when the input datasets have certain special features. In addition, the paper introduces an equivalent mixed-integer programming formulation of CALR that can be solved approximately using existing optimization solvers.

Paper Structure

This paper contains 15 sections, 11 theorems, 7 equations, 3 figures, and 7 algorithms.

Key Result

Lemma

Given any finite dataset $D=\{(x_i,y_i) \mid i=1,2,\cdots,n\}$ such that $y_i\neq y_j \Rightarrow x_i \neq x_j$, there exists a PLDC function interpolating $D$.
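The hypothesis $y_i\neq y_j \Rightarrow x_i \neq x_j$ simply says that no input point appears twice with different responses, so $D$ is the graph of a function. A minimal sketch of that check, assuming exact equality comparison on the coordinates (the name `satisfies_lemma_hypothesis` is hypothetical):

```python
def satisfies_lemma_hypothesis(points):
    """Return True if no x occurs with two different y values.

    points: iterable of (x, y) pairs, where x is a sequence of numbers.
    Exact equality is assumed; noisy floating-point data may need a tolerance.
    """
    seen = {}
    for x, y in points:
        key = tuple(x)
        if key in seen and seen[key] != y:
            return False          # same x, different y: interpolation is impossible
        seen.setdefault(key, y)
    return True
```

Repeated points with identical responses pass the check, since they impose no conflicting constraint on an interpolating function.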

Figures (3)

  • Figure 1: An illustration of multiple-model linear regression. To model $y$ with explanatory variables $(x_1,x_2)$, three local models, each fitted on its own subset, are used to best fit the dataset.
  • Figure 2: An illustration of a CALF, with $H=\{(f_1,S_1),(f_2,S_2),(f_0,S_0)\}$.
  • Figure 3: Using a single linear regression model versus 7 regression models on TBI data yields very different prediction accuracy.

Theorems & Definitions (27)

  • Definition: Convex-Area-Wise Linear Function
  • Definition: PLDC function
  • Lemma
  • Theorem
  • Proof
  • Definition: Linear Regression Problem
  • Definition: Convex-area-wise Linear Regression Problem
  • Definition: Decision problem of CALR
  • Definition: Optimizing CALR problem
  • Lemma
  • ...and 17 more