Deep Copula Classifier: Theory, Consistency, and Empirical Evaluation
Agnideep Aich, Ashit Baran Aich
TL;DR
The paper introduces the Deep Copula Classifier (DCC), a generative, class-conditional model that decouples marginal estimation from dependence modeling using neural copula densities. Under standard regularity conditions, it comes with theoretical guarantees: Bayes consistency and an excess-risk rate of $O(n^{-r/(2r+d)})$ for $r$-smooth copulas. Empirically, DCC learns Bayes-aligned decision regions in synthetic experiments and delivers competitive, well-calibrated performance on the Pima Indians Diabetes dataset, surpassing several baselines on ROC-AUC with calibration on par with logistic regression. The work highlights the practical and theoretical value of modeling feature dependencies with neural copulas, and discusses extensions to high-dimensional, semi-supervised, and streaming settings. Overall, DCC provides a principled, interpretable, and scalable alternative to independence-based classifiers for tasks where dependency structure is critical.
Abstract
We present the Deep Copula Classifier (DCC), a class-conditional generative model that separates marginal estimation from dependence modeling using neural copula densities. DCC is interpretable and Bayes-consistent, and it achieves an excess-risk rate of $O(n^{-r/(2r+d)})$ for $r$-smooth copulas. In a controlled two-class study with strong dependence ($|\rho|=0.995$), DCC learns Bayes-aligned decision regions. With oracle or pooled marginals, it nearly reaches the best achievable performance (accuracy $\approx 0.971$; ROC-AUC $\approx 0.998$). As expected, per-class KDE marginals perform worse (accuracy $0.873$; ROC-AUC $0.957$; PR-AUC $0.966$). On the Pima Indians Diabetes dataset, calibrated DCC ($\tau=1$) achieves accuracy $0.879$, ROC-AUC $0.936$, and PR-AUC $0.870$, outperforming Logistic Regression, SVM (RBF), and Naive Bayes, and tying Logistic Regression for the lowest Expected Calibration Error (ECE). Random Forest is also competitive (accuracy $0.892$; ROC-AUC $0.933$; PR-AUC $0.880$). Directly modeling feature dependence yields strong, well-calibrated performance with a clear probabilistic interpretation, making DCC a practical, theoretically grounded alternative to independence-based classifiers.
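To make the "marginals plus copula" decomposition concrete, the following is a minimal sketch of a class-conditional copula classifier. By Sklar's theorem, each class density factors as $f(x) = c(F_1(x_1), \dots, F_d(x_d)) \prod_j f_j(x_j)$, and the classifier applies the Bayes rule to these class densities. This is not the authors' implementation: a parametric Gaussian copula stands in for the neural copula density, the marginals are per-class Gaussian KDEs, and the toy data (two classes with identical standard-normal marginals and correlations $\pm 0.9$) is an assumption chosen only so that the classes differ in dependence alone. All function names (`fit_class`, `class_log_density`, `predict`, `sample`) are hypothetical.

```python
import numpy as np
from scipy import stats


def fit_class(X):
    """Fit one class: KDE marginals plus a Gaussian-copula correlation matrix.

    Assumption: the Gaussian copula is a stand-in for a neural copula density.
    """
    n, d = X.shape
    kdes = [stats.gaussian_kde(X[:, j]) for j in range(d)]
    # Rank-based pseudo-observations in (0, 1), mapped to normal scores.
    U = stats.rankdata(X, axis=0) / (n + 1)
    R = np.corrcoef(stats.norm.ppf(U), rowvar=False)
    return kdes, R


def class_log_density(model, X):
    """log f(x) = sum_j log f_j(x_j) + log c(F_1(x_1), ..., F_d(x_d))."""
    kdes, R = model
    d = X.shape[1]
    log_marg = np.zeros(len(X))
    U = np.empty(X.shape, dtype=float)
    for j, kde in enumerate(kdes):
        log_marg += np.log(kde(X[:, j]) + 1e-300)
        # KDE-implied marginal CDF, evaluated point by point.
        U[:, j] = [kde.integrate_box_1d(-np.inf, x) for x in X[:, j]]
    Z = stats.norm.ppf(np.clip(U, 1e-6, 1 - 1e-6))
    # Gaussian copula: log c(u) = -0.5 log|R| - 0.5 z^T (R^{-1} - I) z.
    _, logdet = np.linalg.slogdet(R)
    quad = np.einsum("ij,jk,ik->i", Z, np.linalg.inv(R) - np.eye(d), Z)
    return log_marg - 0.5 * logdet - 0.5 * quad


def predict(models, log_priors, X):
    """Bayes rule: pick the class maximizing log prior + log class density."""
    scores = np.column_stack(
        [lp + class_log_density(m, X) for m, lp in zip(models, log_priors)]
    )
    return scores.argmax(axis=1)


def sample(rng, rho, n):
    """Toy data (assumption): standard-normal marginals, correlation rho."""
    cov = [[1.0, rho], [rho, 1.0]]
    return rng.multivariate_normal([0.0, 0.0], cov, size=n)


rng = np.random.default_rng(0)
train = [sample(rng, 0.9, 400), sample(rng, -0.9, 400)]
models = [fit_class(X) for X in train]
log_priors = np.log([0.5, 0.5])

X_test = np.vstack([sample(rng, 0.9, 300), sample(rng, -0.9, 300)])
y_test = np.repeat([0, 1], 300)
acc = (predict(models, log_priors, X_test) == y_test).mean()
print(f"toy accuracy: {acc:.3f}")
```

Because the two toy classes share the same marginals, an independence-based classifier such as Naive Bayes has nothing to separate them with; any accuracy above chance here comes from the copula term alone. Swapping the Gaussian copula for a learned density would change only `fit_class` and the copula line in `class_log_density`.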
