Novel Tau-Informed Initialization for Maximum Likelihood Estimation of Copulas with Discrete Margins

Anna van Es, Eva Cantoni

TL;DR

This paper tackles exact maximum likelihood (ML) estimation for Gaussian copulas with discrete margins in low-count settings, where identifiability and numerical stability pose challenges. It introduces three Kendall's tau–based initializers embedded in an IFM-inspired start, and employs an unconstrained reparameterization with exact rectangle probabilities and analytical gradients to stabilize Newton-type optimization of the log-likelihood $\ell$. Simulations across dimensions and count regimes show that a tau-based initializer (Option 1) with exact ML achieves lower RMSE and bias and faster convergence than alternatives, with analytic gradients delivering superior accuracy and speed, especially as $d$ grows. The methodology preserves ML's statistical guarantees while remaining tractable for moderate- to high-dimensional discrete data, and provides practical guidance on initializer choice and extensions to other margins and copula families.
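As a rough illustration of what a tau-based start looks like, the sketch below computes moment estimates of the Poisson rates and a starting correlation from the classical arcsine relation $\rho = \sin(\pi\tau/2)$, which holds for a Gaussian copula with continuous margins. It is not one of the paper's three initializers: with discrete margins, ties distort this relation, which is the problem those initializers are designed to address. The function name `naive_tau_start` and the clipping bound 0.99 are illustrative assumptions.

```python
import numpy as np
from scipy.stats import kendalltau


def naive_tau_start(x, y, rho_bound=0.99):
    """Crude starting values (lambda_x, lambda_y, rho) for a bivariate fit.

    Assumption: uses the continuous-margin arcsine relation, which is only
    an approximation when the margins are discrete and ties are frequent.
    """
    x = np.asarray(x)
    y = np.asarray(y)

    # IFM-style first step: estimate each Poisson rate from its own margin.
    lam_x = float(x.mean())
    lam_y = float(y.mean())

    # Kendall's tau-b (SciPy's default variant) applies a tie adjustment.
    tau, _ = kendalltau(x, y)
    if np.isnan(tau):  # e.g. one margin is constant
        tau = 0.0

    # Invert tau = (2 / pi) * arcsin(rho) and keep rho away from +/-1.
    rho = float(np.clip(np.sin(0.5 * np.pi * tau), -rho_bound, rho_bound))
    return lam_x, lam_y, rho
```

The returned triple would typically be passed to the optimizer as its starting point; the clipping keeps the starting correlation matrix strictly positive definite.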

Abstract

We study Gaussian-copula models with discrete margins, with primary emphasis on low-count (Poisson) data. Our goal is exact yet computationally efficient maximum likelihood (ML) estimation in regimes where many observations contain small counts, which imperils both identifiability and numerical stability. We develop three novel Kendall's tau-based approaches for initialization tailored to discrete margins in the low-count regime and embed them within a start inspired by inference functions for margins (IFM). We present three practical initializers (exact, low-intensity approximation, and a transformation-based approach) that substantially reduce the number of ML iterations and improve convergence. For the ML stage, we use an unconstrained reparameterization of the model's parameters based on the log and spherical-Cholesky transformations and compute exact rectangle probabilities. Analytical score functions are supplied throughout to stabilize Newton-type optimization. A simulation study across dimensions, dependence levels, and intensity regimes shows that the proposed initialization combined with exact ML achieves lower root-mean-squared error, lower bias, and faster computation times than the alternative procedures. The methodology provides a pragmatic path to retain the statistical guarantees of ML (consistency, asymptotic normality, efficiency under correct specification) while remaining tractable for moderate- to high-dimensional discrete data. We conclude with guidance on initializer choice and discuss extensions to alternative correlation structures and different margins.
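The ML stage described above can be sketched in a few lines, under stated assumptions. The code maps an unconstrained vector to Poisson rates (log link) and to a correlation matrix (spherical Cholesky), then evaluates one bivariate cell probability as a Gaussian rectangle probability. The logistic map from real numbers to angles, the clipping bound `z_max`, and all function names are illustrative assumptions rather than the paper's choices, and the rectangle probability is only shown for $d = 2$.

```python
import numpy as np
from scipy.stats import multivariate_normal, norm, poisson


def corr_from_angles(angles, d):
    """Build a d x d correlation matrix from d(d-1)/2 angles in (0, pi)."""
    L = np.zeros((d, d))
    L[0, 0] = 1.0
    k = 0
    for i in range(1, d):
        sin_prod = 1.0  # running product of sines along row i
        for j in range(i):
            L[i, j] = sin_prod * np.cos(angles[k])
            sin_prod *= np.sin(angles[k])
            k += 1
        L[i, i] = sin_prod  # each row of L has unit norm by construction
    return L @ L.T


def unpack(theta, d):
    """Split an unconstrained vector into Poisson rates and a correlation matrix."""
    lam = np.exp(theta[:d])                      # log link keeps rates positive
    angles = np.pi / (1.0 + np.exp(-theta[d:]))  # assumed logistic map into (0, pi)
    return lam, corr_from_angles(angles, d)


def bivariate_pmf(x, y, lam, rho, z_max=8.0):
    """P(X = x, Y = y) under a Gaussian copula with Poisson margins (d = 2)."""
    def limits(k, rate):
        u = poisson.cdf([k - 1, k], rate)           # F(k - 1) and F(k); F(-1) = 0
        return np.clip(norm.ppf(u), -z_max, z_max)  # keep integration limits finite

    (a1, b1), (a2, b2) = limits(x, lam[0]), limits(y, lam[1])
    cov = np.array([[1.0, rho], [rho, 1.0]])

    def Phi2(s, t):
        return multivariate_normal.cdf([s, t], mean=[0.0, 0.0], cov=cov)

    # Inclusion-exclusion over the four corners of the rectangle.
    return Phi2(b1, b2) - Phi2(a1, b2) - Phi2(b1, a2) + Phi2(a1, a2)
```

Summing `np.log(bivariate_pmf(...))` over observations would give a log-likelihood in `theta` that a Newton-type optimizer can work on without constraints; the analytical score functions mentioned in the abstract are not reproduced here.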

Paper Structure

This paper contains 27 sections, 7 theorems, 87 equations, 9 figures, 4 tables, 1 algorithm.

Key Result

Proposition 2.1

Let $(X, Y)$ be an integer-valued bivariate random vector with marginal distributions $F_X \sim \operatorname{Po}\left(\lambda_X\right), F_Y \sim \operatorname{Po}\left(\lambda_Y\right)$, and let $C_\rho$ be a copula inducing the joint law $H$ of $(X, Y)$ (unique only on $\operatorname{Ran}$ ...). Then: ... Moreover, if $\psi_{\text{Sk}}$ denotes the Pearson-copula link mapping the family parameter ...

Figures (9)

  • Figure 1: The relationship between the values of $\lambda, \rho$ and RMSE for the three different approaches.
  • Figure 2: The relationship between $\rho^{*}$ and $\rho$ for different values of $\lambda_X, \lambda_Y$.
  • Figure 3: The relationship between $\frac{2}{ \pi} \arcsin (\rho)$ and corrected Kendall's tau.
  • Figure 4: Violin plots for $\boldsymbol{\lambda} = (0.5,1)$ for different scenarios.
  • Figure 5: Violin plots for $\boldsymbol{\lambda} = (2,3)$ for different scenarios.
  • ...and 4 more figures

Theorems & Definitions (16)

  • Definition 2.1: Estimator of Kendall's tau for zero-inflated count data (Perrone et al., 2023)
  • Proposition 2.1
  • Proposition 2.2: Kendall's tau approximation in arcsine form
  • Proposition 3.1: Gradient of the log-likelihood of a Gaussian copula with Poisson margins.
  • Definition A.1: Skellam distribution (Skellam, 1946)
  • Lemma A.1
  • Definition A.2: Definition $2.2.1$ in Nelsen (2006)
  • Proof of Lemma A.1
  • Lemma A.2
  • Proof of Lemma A.2
  • ...and 6 more