Strong identifiability and parameter learning in regression with heterogeneous response

Dat Do; Linh Do; XuanLong Nguyen

Abstract

Mixtures of regression are a powerful class of models for regression learning with respect to a highly uncertain and heterogeneous response variable of interest. In addition to being a rich predictive model for the response given some covariates, the parameters in this model class provide useful information about the heterogeneity in the data population, which is represented by the conditional distributions for the response given the covariates associated with a number of distinct but latent subpopulations. In this paper, we investigate conditions of strong identifiability, rates of convergence for conditional density and parameter estimation, and the Bayesian posterior contraction behavior arising in finite mixture of regression models, under exact-fitted and over-fitted settings and when the number of components is unknown. This theory is applicable to common choices of link functions and families of conditional distributions employed by practitioners. We provide simulation studies and data illustrations, which shed some light on the parameter learning behavior found in several popular regression mixture models reported in the literature.
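As a concrete illustration of the model class described above, the following minimal sketch simulates data from a finite mixture of regressions and evaluates its conditional density. It assumes Gaussian component densities with a linear link in a scalar covariate; these choices, and the helper names `sample_mixture_of_regressions` and `conditional_density`, are illustrative assumptions rather than the paper's own setup or code.

```python
import numpy as np

def sample_mixture_of_regressions(n, weights, coefs, sigmas, seed=0):
    """Draw (x, y, z) from a finite mixture of Gaussian linear regressions.

    Hypothetical helper (not from the paper): component k has conditional
    mean a_k + b_k * x and standard deviation sigma_k.
    """
    rng = np.random.default_rng(seed)
    weights = np.asarray(weights, dtype=float)
    coefs = np.asarray(coefs, dtype=float)   # shape (K, 2): intercept, slope
    sigmas = np.asarray(sigmas, dtype=float)
    x = rng.uniform(-1.0, 1.0, size=n)                # covariate
    z = rng.choice(len(weights), size=n, p=weights)   # latent subpopulation label
    mean = coefs[z, 0] + coefs[z, 1] * x
    y = mean + sigmas[z] * rng.standard_normal(n)     # heterogeneous response
    return x, y, z

def conditional_density(y, x, weights, coefs, sigmas):
    """Evaluate sum_k p_k * N(y; a_k + b_k * x, sigma_k^2) for scalar x and y."""
    weights = np.asarray(weights, dtype=float)
    coefs = np.asarray(coefs, dtype=float)
    sigmas = np.asarray(sigmas, dtype=float)
    mean = coefs[:, 0] + coefs[:, 1] * x              # one mean per component
    comp = np.exp(-0.5 * ((y - mean) / sigmas) ** 2) / (sigmas * np.sqrt(2 * np.pi))
    return float(np.dot(weights, comp))

# Two latent subpopulations with opposite slopes (assumed values).
weights = [0.3, 0.7]
coefs = [[0.0, 2.0], [1.0, -2.0]]
sigmas = [0.5, 0.5]
x, y, z = sample_mixture_of_regressions(500, weights, coefs, sigmas)
```

Parameter learning in this setting means recovering the mixing weights and the per-component regression coefficients from the pairs `(x, y)` alone, since the labels `z` are not observed.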

Paper Structure

This paper contains 50 sections, 28 theorems, 244 equations, 4 figures, 1 table, 2 algorithms.

Introduction
Notation
Preliminaries
Regression mixture models
Wasserstein distances
Mixtures of conditional densities
Key inequalities
Strong identifiability and inverse bounds
Conditions of strong identifiability
Characterization of strong identifiability
Inverse bounds for mixture of regression models
Consequences of lack of strong identifiability
Statistical efficiency in learning regression mixtures
Maximum (conditional) likelihood estimation
Bayesian posterior contraction theorems for parameter inference
...and 35 more sections

Key Result

Lemma 2.1

Assume that the uniform Lipschitz conditions on $f$ and $h$ (eq:uniform-Lipschitz-f and eq:uniform-Lipschitz-h) hold. Then for every $G\in \mathcal{O}_{K}(\Theta)$ and $K\geq 1$, we have [displayed inequality not recovered], where the multiplicative constant in this inequality depends only on $c_{f}$, $c_1$, and $c_2$.

Figures (4)

Figure 5.1: Illustrations of the inverse bounds.
Figure 5.2: Convergence rates of parameter estimation in the exact-fitted and over-fitted settings.
Figure 5.3: Convergence rate in the exact-fitted case where the model is non-strongly-identifiable.
Figure 5.4: The impact of Crash data being near pathological cases of non-strong identifiability.

Theorems & Definitions (61)

Lemma 2.1
Definition 3.1
Definition 3.2
Definition 3.3
Remark 3.1
Theorem 3.1
Proposition 3.1
Definition 3.4
Proposition 3.2
Proposition 3.3
...and 51 more
