Learning a Class of Mixed Linear Regressions: Global Convergence under General Data Conditions
Yujing Liu, Zhixin Liu, Lei Guo
TL;DR
The paper tackles learning in a two-component mixed linear regression with unknown labels by developing a two-step online method: recursive least squares first estimates the direction of the unknown parameter vector, and an EM-inspired update then recovers its scaling coefficient. A weighted gain and a projection mechanism stabilize the online updates, which yields global convergence together with a convergence rate under general data conditions that do not require persistent excitation (PE). The authors also prove that data clustering performance, measured by the cumulative misclassification error and the within-cluster error, is asymptotically optimal without any excitation condition on the regressors. Numerical experiments with non-PE data are consistent with the theoretical guarantees, showing vanishing parameter error and improving clustering accuracy, which broadens applicability to stochastic systems with feedback control.
Abstract
Mixed linear regression (MLR) has attracted increasing attention because of its theoretical and practical importance in capturing nonlinear relationships through a mixture of linear regression sub-models. Although considerable effort has been devoted to the learning problem for such systems, i.e., estimating data labels and identifying model parameters, most existing investigations employ offline algorithms, impose strict independent and identically distributed (i.i.d.) or persistent excitation (PE) conditions on the regressor data, and provide local convergence results only. In this paper, we investigate the recursive estimation and data clustering problems for a class of stochastic MLRs with two components. To address this inherently nonconvex optimization problem, we propose a novel two-step recursive identification algorithm to estimate the true parameters, where the direction vector and the scaling coefficient of the unknown parameters are estimated by the least squares and the expectation-maximization (EM) principles, respectively. Under a general data condition, which is much weaker than the traditional i.i.d. and PE conditions, we establish the global convergence and the convergence rate of the proposed identification algorithm for the first time. Furthermore, we prove that, without any excitation condition on the regressor data, the data clustering performance, including the cumulative misclassification error and the within-cluster error, can be asymptotically optimal. Finally, we provide a numerical example to illustrate the performance of the proposed learning algorithm.
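To make the two-step idea concrete, here is a minimal Python sketch. It is not the authors' algorithm. It rests on assumptions that are not stated in the abstract: a symmetric model y = z * beta^T phi + w with labels z in {+1, -1}, P(z = +1) = p > 1/2, and Gaussian noise with known standard deviation sigma. Under that assumed model E[y | phi] = (2p - 1) * beta^T phi, so plain recursive least squares recovers theta = (2p - 1) * beta (the direction of beta), and beta = c * theta with c = 1 / (2p - 1) > 1. The scalar c is then tracked by an EM-style running average that is projected onto an interval. The names rls_step, posterior_mean_label, run_demo, c_min, c_max and burn_in are hypothetical, the weighted gain mentioned in the TL;DR is omitted, and the demo uses i.i.d. Gaussian regressors only for simplicity.

```python
import math
import random


def rls_step(theta, P, phi, y):
    """One standard recursive least squares update for y ~ theta^T phi."""
    d = len(theta)
    P_phi = [sum(P[i][j] * phi[j] for j in range(d)) for i in range(d)]
    denom = 1.0 + sum(phi[i] * P_phi[i] for i in range(d))
    err = y - sum(theta[i] * phi[i] for i in range(d))
    theta = [theta[i] + P_phi[i] * err / denom for i in range(d)]
    # P stays symmetric, so P phi phi^T P equals the outer product of P_phi.
    P = [[P[i][j] - P_phi[i] * P_phi[j] / denom for j in range(d)]
         for i in range(d)]
    return theta, P


def posterior_mean_label(c, u, y, sigma):
    """E[z | y, u] for y = z*c*u + w, assuming P(z=+1) = (1 + 1/c) / 2."""
    half_log_odds = 0.5 * math.log((c + 1.0) / (c - 1.0))
    return math.tanh(c * u * y / sigma ** 2 + half_log_odds)


def run_demo(n=5000, seed=0, sigma=0.5, p=0.8,
             c_min=1.05, c_max=10.0, burn_in=100):
    rng = random.Random(seed)
    beta = [2.0, -1.0]                      # assumed true parameter
    d = len(beta)
    theta = [0.0] * d
    P = [[1.0 if i == j else 0.0 for j in range(d)] for i in range(d)]
    c_hat, s_uy, s_uu = c_max, 0.0, 0.0
    errors, counted = 0, 0
    for k in range(n):
        phi = [rng.gauss(0.0, 1.0) for _ in range(d)]
        z = 1 if rng.random() < p else -1   # hidden label
        y = z * sum(b * x for b, x in zip(beta, phi)) + rng.gauss(0.0, sigma)
        # Step 1: least squares estimate of theta = (2p - 1) * beta.
        theta, P = rls_step(theta, P, phi, y)
        if k < burn_in:
            continue                        # let the direction settle first
        # Step 2: EM-style update of the scale c, then projection.
        u = sum(t * x for t, x in zip(theta, phi))
        m = posterior_mean_label(c_hat, u, y, sigma)
        s_uy += m * u * y
        s_uu += u * u
        if s_uu > 0.0:
            c_hat = min(c_max, max(c_min, s_uy / s_uu))
        # Cluster the sample by the sign of the posterior mean label.
        errors += (1 if m >= 0 else -1) != z
        counted += 1
    return theta, c_hat, errors / counted
```

Calling run_demo() returns the direction estimate, the scale estimate and the empirical misclassification rate; multiplying the first by the second gives an estimate of beta under the assumed model. The projection keeps c_hat strictly above 1 so that the logarithm in posterior_mean_label stays well defined.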
