A Generative Approach to Quasi-Random Sampling from Copulas via Space-Filling Designs

Sumin Wang, Chenxian Huang, Yongdao Zhou, Min-Qian Liu

TL;DR

This paper tackles the challenge of generating high-quality quasi-random samples from general copulas. It introduces a generative-adversarial-network framework to learn the copula transport map $\phi_C$ and then uses space-filling designs to push low-dimensional uniform inputs through the learned map, yielding efficient quasi-Monte Carlo samples for high-dimensional copulas. The authors establish theoretical guarantees for the learned map and the resulting estimators, including convergence rates for bias and variance, and demonstrate superior performance to CDM and GMMN in simulations and a real-data risk-management application. The approach achieves substantial variance reduction and scalable sampling across dimensions, with clear practical implications for numerical integration and financial risk assessment.

Abstract

Exploring the dependence between covariates across distributions is crucial for many applications. Copulas are a powerful tool for modeling dependence among jointly distributed variables and have been applied effectively in many practical contexts thanks to their intuitive properties. However, existing computational methods cannot feasibly perform inference on, or sample from, arbitrary copulas, which limits their widespread use. This paper introduces a quasi-random sampling approach for copulas that combines generative adversarial networks (GANs) with space-filling designs. The proposed framework uses GANs to construct a direct mapping from low-dimensional uniform distributions to high-dimensional copula structures, and generates quasi-random samples for any copula structure from the point sets of space-filling designs. In high-dimensional settings with limited data, the proposed approach significantly improves sampling accuracy and computational efficiency compared to existing methods. Additionally, we develop convergence rate theory for quasi-Monte Carlo estimators, providing rigorous upper bounds for bias and variance. Both simulated experiments and practical implementations, particularly in risk management, validate the proposed method and demonstrate its superiority over existing alternatives.
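The sampling step described in the abstract (push a space-filling point set through a transport map) can be sketched in a few lines. This is only an illustration under stated assumptions: a 2-D Halton sequence stands in for the space-filling design, and the closed-form conditional inverse of a bivariate Clayton copula stands in for the GAN-learned map $\phi_C$, which is not reproduced here. The function names and the choice $\theta = 2$ are hypothetical.

```python
import math
from typing import List, Tuple


def radical_inverse(index: int, base: int) -> float:
    """Van der Corput radical inverse of a positive integer in the given base."""
    result, fraction = 0.0, 1.0 / base
    while index > 0:
        index, digit = divmod(index, base)
        result += digit * fraction
        fraction /= base
    return result


def halton_2d(n: int) -> List[Tuple[float, float]]:
    """First n points of the 2-D Halton sequence (bases 2 and 3).

    Assumption: Halton is used as a simple low-discrepancy stand-in for the
    space-filling designs discussed in the paper. Index 0 is skipped so that
    every coordinate lies strictly inside (0, 1).
    """
    return [(radical_inverse(i, 2), radical_inverse(i, 3)) for i in range(1, n + 1)]


def clayton_map(v1: float, v2: float, theta: float) -> Tuple[float, float]:
    """Map uniform inputs (v1, v2) to a bivariate Clayton copula sample.

    Assumption: this closed-form conditional inverse replaces the learned
    generator; a trained network would be called here instead.
    """
    u1 = v1
    inner = u1 ** (-theta) * (v2 ** (-theta / (1.0 + theta)) - 1.0) + 1.0
    u2 = inner ** (-1.0 / theta)
    return u1, u2


def quasi_random_copula_sample(n: int, theta: float = 2.0) -> List[Tuple[float, float]]:
    """Push the low-discrepancy points through the transport map."""
    return [clayton_map(v1, v2, theta) for v1, v2 in halton_2d(n)]


# Quasi-Monte Carlo estimate of E[U1 * U2] under the copula.
sample = quasi_random_copula_sample(1024)
qmc_estimate = sum(u1 * u2 for u1, u2 in sample) / len(sample)
```

Swapping `clayton_map` for a trained generator leaves the rest of the pipeline unchanged, which is the structural point of the approach: the design points carry the low-discrepancy property, and the map carries the dependence structure.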

Paper Structure

This paper contains 17 sections, 9 theorems, 88 equations, 5 figures, 2 algorithms.

Key Result

Theorem 1

Given $N$ samples $\{\boldsymbol{u}_i\}_{i=1}^{N}$ for training a GAN, and assuming that Assumptions (A.1)--(A.4) hold, along with the condition $N^{1/(2+d)} > d(\log N)^{1/d}$, we further make the following assumptions: (a) the target copula $C$ is supported on $[0,1]^d$, and (b) the source distr… where … In addition, if $\|\Psi\|_{\infty} < \infty$, we can obtain …
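The sample-size condition $N^{1/(2+d)} > d(\log N)^{1/d}$ in Theorem 1 becomes demanding quickly as $d$ grows, and a short numerical check makes this concrete. The sketch below assumes $\log$ denotes the natural logarithm (the excerpt does not say), and both function names are hypothetical.

```python
import math
from typing import Optional


def sample_size_condition_holds(n_train: int, d: int) -> bool:
    """Check N^{1/(2+d)} > d * (log N)^{1/d}.

    Assumption: log is the natural logarithm.
    """
    if n_train < 2 or d < 1:
        raise ValueError("need n_train >= 2 and d >= 1")
    return n_train ** (1.0 / (2 + d)) > d * math.log(n_train) ** (1.0 / d)


def smallest_power_of_ten(d: int, max_exponent: int = 50) -> Optional[int]:
    """Smallest N = 10^k (1 <= k <= max_exponent) satisfying the condition, else None."""
    for k in range(1, max_exponent + 1):
        if sample_size_condition_holds(10 ** k, d):
            return 10 ** k
    return None
```

For $d = 2$, for example, the check fails at $N = 100$ and first passes among powers of ten at $N = 10^3$.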

Figures (5)

  • Figure 1: Quasi-random samples obtained by CDM and GAN, all of size $n = 1000$, from a bivariate Marshall--Olkin copula (left), a three-dimensional Clayton copula (middle) and a three-dimensional Gumbel copula (right).
  • Figure 2: Boxplots based on $B=100$ realizations of the statistic $S_{n}$ (lower values are better), constructed for three different methods: (i) the CDM, (ii) GANs with two input types, and (iii) the GMMN. All boxplots correspond to a sample size of $n=1000$. Results are displayed for a bivariate Marshall--Olkin copula (left, $d=2$), a three-dimensional Clayton copula (middle, $d=3$), and a three-dimensional Gumbel copula (right, $d=3$).
  • Figure 3: Standard deviation estimates computed using $B = 25$ replications for estimating $\operatorname{ES}_{0.99}(S)$ with CDM, GANs, and GMMN estimators, presented for a three-dimensional Clayton copula (left), a three-dimensional Gumbel copula (middle), and a bivariate Marshall--Olkin copula (right).
  • Figure 4: Boxplots based on $B=20$ realizations of $S_{N, n}$, computed from (i) the CDM method, (ii) GANs with two types of inputs, and (iii) the GMMN method, all with sample size $n=1000$, for $d=10$ (left) and $d=20$ (right).
  • Figure 5: Standard deviation estimates computed using $B = 20$ replications for estimating $\operatorname{ES}_{0.99}(S)$ with CDM, GANs, and GMMN estimators, for dimensions $d=10$ (left) and $d=20$ (right).
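The quantity reported in Figures 3 and 5, the standard deviation of an $\operatorname{ES}_{0.99}(S)$ estimate across $B$ replications, can be computed along the following lines. This is a minimal sketch, not the authors' setup: the estimator is the plain empirical expected shortfall (mean of the largest $\lceil n(1-\alpha) \rceil$ losses), and `toy_losses` is a hypothetical pseudo-random sampler with independent standard normal margins that stands in for the copula-based samplers compared in the paper.

```python
import math
import random
import statistics
from typing import Callable, List, Sequence


def expected_shortfall(losses: Sequence[float], alpha: float = 0.99) -> float:
    """Empirical expected shortfall: mean of the largest ceil(n * (1 - alpha)) losses."""
    ordered = sorted(losses)
    # Round before ceil so floating-point noise (e.g. 10.000000000000009) does not add a point.
    k = max(1, math.ceil(round(len(ordered) * (1.0 - alpha), 9)))
    return sum(ordered[-k:]) / k


def replicate_std(
    estimator: Callable[[Sequence[float]], float],
    sampler: Callable[[random.Random, int], List[float]],
    n: int,
    replications: int,
    seed: int = 0,
) -> float:
    """Sample standard deviation of the estimator over independent replications."""
    estimates = []
    for b in range(replications):
        rng = random.Random(seed + b)  # a separate seeded stream per replication
        estimates.append(estimator(sampler(rng, n)))
    return statistics.stdev(estimates)


def toy_losses(rng: random.Random, n: int, d: int = 3) -> List[float]:
    """Hypothetical aggregate loss S: sum of d independent standard normal margins."""
    return [sum(rng.gauss(0.0, 1.0) for _ in range(d)) for _ in range(n)]


# Standard deviation of the ES_0.99 estimate over B = 25 replications of size n = 1000.
sd = replicate_std(expected_shortfall, toy_losses, n=1000, replications=25)
```

Replacing `toy_losses` with a sampler that draws from a fitted copula and applies the marginal quantile functions would give the kind of comparison the figures describe; a lower `sd` at the same `n` indicates a more efficient sampler.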

Theorems & Definitions (16)

  • Definition 1
  • Definition 2
  • Definition 3
  • Theorem 1
  • Theorem 2
  • Theorem 3
  • Lemma S1 (zhou2023deep)
  • Lemma S2 (zhou2023deep)
  • Lemma S3 (shen2020)
  • Lemma S4
  • ...and 6 more