Semiparametric copula-based quantile regression for semicontinuous outcomes with application to healthcare data

Guanjie Lyu, Mohamed Belalia, Abdulkadir Hussein

Abstract

A semiparametric copula-based two-part quantile regression framework is developed for the analysis of semicontinuous outcomes characterized by a point mass at zero and a continuous positive component. The proposed approach models the occurrence and magnitude processes separately and links them through copula-based conditional distributions, allowing for flexible dependence structures and nonlinear covariate effects across quantiles. Large-sample properties of the resulting estimator are established, and extensive simulation studies demonstrate improved finite-sample performance relative to logistic/linear quantile regression, particularly under nonlinear dependence and substantial zero inflation. An application to healthcare data illustrates how the proposed method provides a nuanced characterization of the association between social deprivation and uncompensated and charity care burdens, revealing heterogeneous and nonlinear effects that are not captured by competing approaches.
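To make the two-part structure concrete, the sketch below shows how a point mass at zero and a continuous positive component combine into a single quantile function. It uses only the standard identity for semicontinuous outcomes: if $P(Y > 0 \mid \bm{x}) = p$, the $\tau$-quantile of $Y$ is $0$ when $\tau \le 1 - p$, and otherwise equals the quantile of the positive part at the rescaled level $(\tau - (1 - p))/p$. This is not the paper's copula-based estimator; the function name `two_part_quantile`, the 40% zero mass, and the exponential positive part are all hypothetical choices made for illustration.

```python
import math


def two_part_quantile(tau, p_positive, positive_quantile):
    """tau-quantile of a semicontinuous outcome Y.

    p_positive        : P(Y > 0), the occurrence probability (assumed known here).
    positive_quantile : function u -> u-quantile of Y given Y > 0.
    """
    p_zero = 1.0 - p_positive
    # Levels at or below the zero mass are absorbed by the point mass at zero.
    if tau <= p_zero:
        return 0.0
    # Otherwise rescale the level to the positive part of the distribution.
    u = (tau - p_zero) / p_positive
    return positive_quantile(u)


def exp_quantile(u, scale=2.0):
    """Quantile function of an exponential distribution (illustrative choice)."""
    return -scale * math.log1p(-u)


# Hypothetical setting: 40% zeros, positive part exponential with scale 2.
q30 = two_part_quantile(0.3, 0.6, exp_quantile)  # level lies inside the zero mass
q70 = two_part_quantile(0.7, 0.6, exp_quantile)  # rescaled level is 0.5
```

In the framework described in the abstract, both the occurrence probability and the positive-part quantile would be estimated from copula-based conditional distributions rather than fixed in advance as they are here.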

Paper Structure

This paper contains 14 sections, 1 theorem, 22 equations, 2 figures, 5 tables, and 1 algorithm.

Key Result

Theorem 2.2

Under the paper's assumptions, for every $\tau \in (0, 1)$ and $\bm{x}\in {\mathbb{R}}^d$, the theorem states a limit in which "$\xrightarrow[n\to \infty]{P}$" stands for convergence in probability.

Figures (2)

  • Figure 1: Uncompensated care burden versus SDI score. All pointwise confidence bands are derived from $300$ bootstrap samples. Top left: estimated probability of occurrence of uncompensated care under logistic regression (green dashed line) and the copula binary model (red solid line). Top right: $50\%$ quantile estimates from linear quantile regression (orange dotted line), logistic/linear quantile regression (green dashed line), and the copula-based two-part model (red solid line). Bottom left: $70\%$ quantile estimates from the three models. Bottom right: $90\%$ quantile estimates from the three models.
  • Figure 2: Charity care burden versus SDI score. All pointwise confidence bands are derived from $300$ bootstrap samples. Top left: estimated probability of occurrence of charity care under logistic regression (green dashed line) and the copula binary model (red solid line). Top right: $50\%$ quantile estimates from linear quantile regression (orange dotted line), logistic/linear quantile regression (green dashed line), and the copula-based two-part model (red solid line). Bottom left: $70\%$ quantile estimates from the three models. Bottom right: $90\%$ quantile estimates from the three models.

Theorems & Definitions (2)

  • Theorem 2.2
  • proof