Kernel Smoothing for Bounded Copula Densities

Mathias N. Muia, Olivia Atutey, Mahmud Hasan

TL;DR

This work develops a two-stage, nonparametric kernel estimator for bivariate copula densities that remains well-behaved near the boundaries of the unit square through a mirror-reflection technique. It characterizes the estimator's bias and variance and proves uniform consistency and asymptotic normality, along with practical bandwidth selection strategies: an AMISE-based rule of thumb and data-driven cross-validation (LSCV and BCV). Simulations and a real-data application to the Wisconsin Breast Cancer Diagnostic Dataset indicate that AMISE-optimal bandwidths perform robustly when margins have unbounded support, while LSCV offers asymptotic optimality and useful bandwidth tuning in practice. The study offers a principled framework for estimating bounded copula densities with boundary corrections, enabling reliable dependence modeling in applications where marginal supports vary in extent and shape.

Abstract

Nonparametric estimation of copula density functions using kernel estimators presents significant challenges. One issue is the potential unboundedness of certain copula density functions at the corners of the unit square. Another is the boundary bias inherent in kernel density estimation. This paper presents a kernel-based method for estimating bounded copula density functions, addressing boundary bias through the mirror-reflection technique. Optimal smoothing parameters are derived via Asymptotic Mean Integrated Squared Error (AMISE) minimization and cross-validation, with theoretical guarantees of consistency and asymptotic normality. Two kernel smoothing strategies are proposed: the rule-of-thumb approach and least squares cross-validation (LSCV). Simulation studies highlight the efficacy of the rule-of-thumb method in bandwidth selection for copulas with unbounded marginal supports. The methodology is further validated through an application to the Wisconsin Breast Cancer Diagnostic Dataset (WBCDD), where LSCV is used for bandwidth selection.
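The mirror-reflection idea mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the Gaussian product kernel, the single shared bandwidth `h`, the rank-based pseudo-observations, and the function names `pseudo_observations` and `mirror_reflection_copula_density` are all assumptions made for illustration. Each pseudo-observation is reflected about the edges and corners of the unit square, giving nine copies per point, and an ordinary product-kernel estimate is computed over the augmented sample, so that kernel mass that would otherwise spill outside $[0,1]^2$ is folded back in.

```python
import numpy as np


def pseudo_observations(x):
    """Rank-transform a 1-D sample into (0, 1) via rank / (n + 1).

    Assumption: margins are handled by ranks; ties are broken by order.
    """
    x = np.asarray(x, dtype=float)
    ranks = np.argsort(np.argsort(x)) + 1  # ranks 1..n
    return ranks / (x.size + 1.0)


def _gauss(z):
    """Standard normal density, used here as an assumed kernel K."""
    return np.exp(-0.5 * z**2) / np.sqrt(2.0 * np.pi)


def mirror_reflection_copula_density(u, v, U, V, h):
    """Mirror-reflection kernel estimate of a copula density.

    u, v : evaluation points in [0, 1] (arrays of equal length)
    U, V : pseudo-observations in (0, 1)
    h    : bandwidth shared by both coordinates (an assumption)
    """
    u = np.atleast_1d(np.asarray(u, dtype=float))
    v = np.atleast_1d(np.asarray(v, dtype=float))
    U = np.asarray(U, dtype=float)
    V = np.asarray(V, dtype=float)
    n = U.size

    # Reflect each coordinate about 0 and about 1: x -> -x, x, 2 - x.
    U_refl = np.stack([-U, U, 2.0 - U])  # shape (3, n)
    V_refl = np.stack([-V, V, 2.0 - V])  # shape (3, n)

    # Kernel weights for every (evaluation point, reflection, observation).
    Ku = _gauss((u[:, None, None] - U_refl[None]) / h)  # shape (m, 3, n)
    Kv = _gauss((v[:, None, None] - V_refl[None]) / h)  # shape (m, 3, n)

    # Summing the product kernel over the 3 x 3 = 9 reflected copies of
    # observation i equals (sum of its 3 u-weights) * (sum of its 3 v-weights).
    per_obs = Ku.sum(axis=1) * Kv.sum(axis=1)  # shape (m, n)
    return per_obs.sum(axis=1) / (n * h**2)
```

Under these assumptions, calling `mirror_reflection_copula_density(uu.ravel(), vv.ravel(), U, V, h=0.1)` on a grid of midpoints in the unit square returns a non-negative surface whose grid average is close to one for moderate `h`, which is a quick sanity check that the reflection is returning boundary mass to the square. In practice `h` would come from a selector such as the rule of thumb or LSCV discussed in the paper.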

Paper Structure

This paper contains 13 sections, 4 theorems, 48 equations, 6 figures, 1 table.

Key Result

Proposition 3.1

Let $c(u,v)$ be a copula density function that is twice continuously differentiable on $[0,1]^2$. Assume that the smoothing parameter satisfies $h_n \to 0$ and $nh_n^2 \to \infty$ as $n \to \infty$. Then, for all points $(u,v) \in [0,1]^2$, the bias and variance of the estimator $\hat{c}(u,v)$ are given by ... where $R(K)$ and $\mu_2(K)$ are the kernel's $L_2$-norm and moment quantities defined earlier in the paper. Here, $c_{uu}$ ...

Figures (6)

  • Figure 1: (a) Scatterplot of original data from the Frank copula sample and (b) scatterplot of the transformed sample.
  • Figure 2: (a) Contour plot of the true density; (b) Contour plot of the mirror reflection estimates on simulated data (n=500) of a Frank copula. Bandwidths are selected based on the LSCV criterion.
  • Figure 3: (a) Mirror reflection estimate for the Frank copula transformed sample; (b) perspective plot for the true density of the Frank copula, and (c) perspective plot of the mirror-reflection kernel density estimate.
  • Figure 4: (a) Scatter plot of data after transformation by its marginal empirical distribution function; (b) scatter plot of the transformed data with standard normal margins.
  • Figure 5: Exploratory visualizations of the WBCDD data and copula density: (a) Contour plot of the copula density combined with standard normal margins; (b) kernel density estimate for transformed sample (with standard normal margins); and (c) surface plot of the copula density.
  • ...and 1 more figure

Theorems & Definitions (9)

  • Definition 2.1
  • Proposition 3.1
  • proof
  • Theorem 4.1
  • proof
  • Lemma 4.2
  • proof
  • Proposition 4.3
  • proof