Mean and quantile regression in the copula setting: properties, sharp bounds and a note on estimation
Henrik Kaiser, Wolfgang Trutschnig
TL;DR
The paper investigates how uniform marginals on $[0,1]$ constrain mean and quantile regression in a copula framework, deriving sharp, dimension-aware bounds for the $L^p$-deviation of the mean regression from $\tfrac{1}{2}$ and for the distribution of large deviations. It extends these results to quantile regression, proving tight bounds for the average quantile function and establishing a corresponding $D_{A,p}$ metric that governs regression convergence. Key findings include that the maximal $L^p$-deviation of mean regression is $\tfrac{1}{2}(p+1)^{-1/p}$, attained by completely dependent copulas, and that the average quantile satisfies $\int Q_C^\tau(\mathbf{x}) \, d\mu_A(\mathbf{x}) \in [\tfrac{\tau}{2}, \tfrac{\tau+1}{2}]$ with sharp bounds. The paper also proves strong consistency of the empirical checkerboard estimator for both mean and quantile regression in the bivariate setting, providing practical guarantees for nonparametric copula-based regression estimation and highlighting caveats for simplifying assumptions in pair copula constructions.
Abstract
Driven by the interest in how uniformity of marginal distributions propagates to properties of regression functions, in this contribution we tackle the following questions: Given a $(d-1)$-dimensional random vector $\textbf{X}$ and a random variable $Y$ such that all univariate marginals of $(\textbf{X},Y)$ are uniformly distributed on $[0,1]$, how large can the average absolute deviation of the mean and the quantile regression function of $Y$ given $\textbf{X}$ from the value $\frac{1}{2}$ be, and how much mass may sets with large deviation have? We answer these questions by deriving sharp inequalities, in both the mean and the quantile setting, and sketch some cautionary consequences for the currently popular pair copula constructions involving the so-called simplifying assumption. Rounding off our results, working with the so-called empirical checkerboard estimator in the bivariate setting, we show strong consistency for both regression types and illustrate the speed of convergence by means of a simulation study.
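As a rough numerical illustration of the bound $\tfrac{1}{2}(p+1)^{-1/p}$ quoted in the TL;DR, the following sketch considers the bivariate case with a completely dependent copula, i.e. $Y = h(X)$ for a Lebesgue-measure-preserving map $h$, so that the mean regression is $h$ itself. The choice $h(x) = 2x \bmod 1$ and the function names `lp_bound` and `lp_deviation_completely_dependent` are assumptions made for this example only and are not taken from the paper.

```python
import numpy as np


def lp_bound(p):
    """Closed-form value (1/2) * (p + 1)^(-1/p) quoted in the TL;DR."""
    return 0.5 * (p + 1.0) ** (-1.0 / p)


def lp_deviation_completely_dependent(p, n=1_000_000):
    """Approximate the L^p-distance between E(Y | X = x) and 1/2 when Y = h(X).

    Assumption for this sketch: h(x) = 2x mod 1, which preserves Lebesgue
    measure on [0, 1], so (X, Y) has uniform marginals and E(Y | X = x) = h(x).
    The integral over [0, 1] is approximated with a midpoint rule on n cells.
    """
    x = (np.arange(n) + 0.5) / n       # midpoints of n equal cells in [0, 1]
    h = (2.0 * x) % 1.0                # mean regression function of Y given X
    return np.mean(np.abs(h - 0.5) ** p) ** (1.0 / p)


if __name__ == "__main__":
    for p in (1, 2, 4):
        approx = lp_deviation_completely_dependent(p)
        print(f"p={p}: numerical={approx:.6f}, closed form={lp_bound(p):.6f}")
```

Under these assumptions the numerical value and the closed form should agree up to discretization error, which is consistent with the statement that completely dependent copulas attain the maximal deviation; the sketch does not verify that no other copula exceeds it.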
