MCD: Marginal Contrastive Discrimination for conditional density estimation
Katia Meziani, Aminata Ndiaye, Benjamin Riu
TL;DR
The paper tackles conditional density estimation (CDE) by introducing Marginal Contrastive Discrimination (MCD), which factorizes $f_{Y|X=x}(y)$ into the marginal density $f_Y(y)$ and a contrast term: $f_{Y|X=x}(y) = f_Y(y) \cdot \frac{q(x,y)}{1-q(x,y)} \cdot \frac{1-r}{r}$, where $q(x,y) = \frac{r f_{X,Y}(x,y)}{r f_{X,Y}(x,y) + (1-r) f_X(x) f_Y(y)}$ and $r \in (0,1)$ is the class prior $P(Z=1)$. The method uses supervised learning to estimate $q$ under the Marginal Discrimination Condition (MDcond): it constructs a labelled dataset $(W,Z)$ such that $f_{W|Z=1} = f_{X,Y}$ and $f_{W|Z=0} = f_X f_Y$, so that $q$ is the probability that a pair $(x,y)$ was drawn from the joint distribution rather than from the product of the marginals. Any standard binary classifier can then be used, which makes CDE scalable with ordinary ML tools. The authors provide theoretical constructions for building large i.i.d. or near-i.i.d. training sets from the available data, including settings with additional marginal data and multiple targets. They report strong empirical performance across a battery of density models and real datasets, often surpassing existing CDE methods. An open-source Python implementation and detailed ablation studies support the practical promise of MCD for high-dimensional conditional density estimation.
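The factorization can be checked in two lines from the definition of $q$. The short derivation below is a verification sketch rather than the paper's own proof; it assumes $r \in (0,1)$ and that $f_X(x)$ and $f_Y(y)$ are strictly positive at the point of interest.

```latex
\begin{align*}
\frac{q(x,y)}{1-q(x,y)}
  &= \frac{r\, f_{X,Y}(x,y)}{(1-r)\, f_X(x)\, f_Y(y)}, \\
f_Y(y) \cdot \frac{q(x,y)}{1-q(x,y)} \cdot \frac{1-r}{r}
  &= \frac{f_{X,Y}(x,y)}{f_X(x)}
   = f_{Y|X=x}(y).
\end{align*}
```

The factor $(1-r)/r$ only undoes the class imbalance of the constructed dataset, so it equals 1 when the two classes are balanced ($r = 1/2$).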
Abstract
We consider the problem of conditional density estimation, a major topic of interest in statistical and machine learning. Our method, Marginal Contrastive Discrimination (MCD), reformulates the conditional density function as the product of two factors: the marginal density function of the target variable and a ratio of density functions that can be estimated through binary classification. Like noise-contrastive methods, MCD can leverage state-of-the-art supervised learning techniques, including neural networks, to perform conditional density estimation. Our benchmark shows that, in practice, our method significantly outperforms existing methods on most density models and regression datasets.
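The two-factor recipe can be illustrated on a toy problem. The sketch below is not the authors' implementation: it uses only the Python standard library, and every name in it (`build_contrast_set`, `fit_logistic`, `kde`, `conditional_density`, `mcd_estimate`) is hypothetical. It assumes a Gaussian toy model, a logistic regression on quadratic features as the binary classifier, a Gaussian kernel density estimate for the marginal $f_Y$, and a contrast set obtained by shuffling the targets, which only approximates independent draws from $f_X f_Y$.

```python
import math
import random
import statistics


def build_contrast_set(xs, ys, rng):
    """Label observed pairs Z=1 and pairs with shuffled targets Z=0."""
    ys_shuffled = list(ys)
    rng.shuffle(ys_shuffled)  # breaks the X-Y dependence, approximating f_X * f_Y
    w = list(zip(xs, ys)) + list(zip(xs, ys_shuffled))
    z = [1] * len(xs) + [0] * len(xs)
    return w, z


def features(x, y):
    # Assumption: quadratic features, sufficient for the Gaussian toy model only.
    return (1.0, x * x, y * y, x * y)


def sigmoid(t):
    # Numerically stable logistic function.
    if t >= 0:
        return 1.0 / (1.0 + math.exp(-t))
    e = math.exp(t)
    return e / (1.0 + e)


def fit_logistic(w, z, lr=0.1, n_iter=300):
    """Plain batch gradient descent on the mean logistic loss."""
    phi = [features(x, y) for x, y in w]
    theta = [0.0] * len(phi[0])
    for _ in range(n_iter):
        grad = [0.0] * len(theta)
        for p, label in zip(phi, z):
            err = sigmoid(sum(t * f for t, f in zip(theta, p))) - label
            for k, f in enumerate(p):
                grad[k] += err * f
        theta = [t - lr * g / len(phi) for t, g in zip(theta, grad)]
    return theta


def kde(ys):
    """Gaussian kernel density estimate of f_Y (Silverman-style bandwidth)."""
    h = 1.06 * statistics.stdev(ys) * len(ys) ** (-0.2)
    norm = len(ys) * h * math.sqrt(2.0 * math.pi)
    return lambda y: sum(math.exp(-0.5 * ((y - v) / h) ** 2) for v in ys) / norm


def conditional_density(f_y_value, q, r):
    """MCD identity: f_{Y|X=x}(y) = f_Y(y) * q/(1-q) * (1-r)/r."""
    return f_y_value * (q / (1.0 - q)) * ((1.0 - r) / r)


rng = random.Random(0)
xs = [rng.gauss(0.0, 1.0) for _ in range(500)]
ys = [x + rng.gauss(0.0, 0.5) for x in xs]  # toy model: Y | X=x ~ N(x, 0.5^2)

w, z = build_contrast_set(xs, ys, rng)
r = sum(z) / len(z)  # proportion of Z=1 samples (0.5 for this construction)
theta = fit_logistic(w, z)
f_y_hat = kde(ys)


def mcd_estimate(x, y):
    """Estimated conditional density of Y at y given X = x."""
    q = sigmoid(sum(t * f for t, f in zip(theta, features(x, y))))
    q = min(max(q, 1e-12), 1.0 - 1e-12)  # keep q / (1 - q) finite
    return conditional_density(f_y_hat(y), q, r)


# Under the toy model, y = 1.0 is far more likely than y = -1.5 given x = 1.0.
print(mcd_estimate(1.0, 1.0), mcd_estimate(1.0, -1.5))
```

In this sketch the classifier and the marginal estimator are deliberately simple and interchangeable: any model that outputs a calibrated probability for $Z=1$ could replace `fit_logistic`, and any density estimator for $Y$ could replace `kde`.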
