A Likelihood Based Approach to Distribution Regression Using Conditional Deep Generative Models

Shivam Kumar; Yun Yang; Lizhen Lin

TL;DR

This work analyzes conditional distribution regression using a likelihood-based sieve MLE for conditional deep generative models, addressing high-dimensional responses concentrated on low-dimensional manifolds with ambient noise. The authors derive convergence rates in the Hellinger distance $d_H$ for the conditional density and in the Wasserstein distance $W_r$ for the intrinsic conditional distributions, showing that these rates depend primarily on intrinsic dimension and smoothness, not ambient dimensionality. A key practical insight is the benefit of injecting small noise when data lie near manifolds, which improves estimator stability and convergence. Numerical experiments on synthetic data, manifolds, and MNIST validate the theory and demonstrate competitive performance of sparse and dense neural sieve classes against established conditional density methods. Overall, the paper provides a principled statistical foundation for using conditional deep generative models in distribution regression and highlights their potential to exploit intrinsic geometry for scalable learning.

Abstract

In this work, we explore the theoretical properties of conditional deep generative models under the statistical framework of distribution regression where the response variable lies in a high-dimensional ambient space but concentrates around a potentially lower-dimensional manifold. More specifically, we study the large-sample properties of a likelihood-based approach for estimating these models. Our results lead to the convergence rate of a sieve maximum likelihood estimator (MLE) for estimating the conditional distribution (and its devolved counterpart) of the response given predictors in the Hellinger (Wasserstein) metric. Our rates depend solely on the intrinsic dimension and smoothness of the true conditional distribution. These findings provide an explanation of why conditional deep generative models can circumvent the curse of dimensionality from the perspective of statistical foundations and demonstrate that they can learn a broader class of nearly singular conditional distributions. Our analysis also emphasizes the importance of introducing a small noise perturbation to the data when they are supported sufficiently close to a manifold. Finally, in our numerical studies, we demonstrate the effective implementation of the proposed approach using both synthetic and real-world datasets, which also provide complementary validation to our theoretical findings.
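The likelihood-based approach described above can be illustrated with a minimal sketch. It assumes, as one reading of "ambient noise" and of the family $P_{g,\sigma}$ in Lemma 1, that the response is generated as $Y = g(Z, X) + \sigma\varepsilon$ with Gaussian $\varepsilon$, so that the conditional density is a mixture of Gaussians over the latent variable $Z$ and can be approximated by Monte Carlo. The function names, the uniform latent distribution, and the toy circle generator are hypothetical choices for illustration, not the authors' implementation.

```python
import numpy as np

def log_gaussian(resid, sigma):
    """Log density of N(0, sigma^2 I_D) evaluated at each row of resid."""
    D = resid.shape[-1]
    sq = np.sum(resid ** 2, axis=-1)
    return -0.5 * sq / sigma**2 - 0.5 * D * np.log(2.0 * np.pi * sigma**2)

def mc_log_likelihood(g, x, y, sigma, z_samples):
    """Monte Carlo estimate of log p_{g,sigma}(y | x).

    Assumed model: Y = g(Z, x) + sigma * eps, eps ~ N(0, I_D), so
    p(y | x) = E_Z[ phi_sigma(y - g(Z, x)) ], averaged over latent draws.
    """
    means = g(z_samples, x)                              # shape (m, D)
    log_terms = log_gaussian(y[None, :] - means, sigma)  # shape (m,)
    top = np.max(log_terms)                              # log-sum-exp trick
    return top + np.log(np.mean(np.exp(log_terms - top)))

def toy_generator(z, x):
    """Hypothetical generator: a circle in R^2 whose radius depends on x."""
    angle = 2.0 * np.pi * z[:, 0]
    radius = 1.0 + x
    return np.stack([radius * np.cos(angle), radius * np.sin(angle)], axis=1)

rng = np.random.default_rng(0)
z_samples = rng.uniform(size=(2000, 1))  # assumed latent law: Uniform[0, 1]
x = 0.5
y_on = np.array([1.5, 0.0])   # lies on the circle of radius 1.5
y_off = np.array([0.0, 0.0])  # circle center, distance 1.5 from the manifold

ll_on = mc_log_likelihood(toy_generator, x, y_on, 0.1, z_samples)
ll_off = mc_log_likelihood(toy_generator, x, y_off, 0.1, z_samples)
print(ll_on, ll_off)
```

Under these assumptions, a sieve MLE would maximize the sum of such log-likelihood terms over the observed pairs, with $g$ ranging over a neural network class and $\sigma$ over a bounded interval. In the toy example, the point on the circle receives a much higher log-likelihood than the point at its center, and shrinking $\sigma$ sharpens that contrast, which is the nearly singular regime the abstract refers to.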

Paper Structure (29 sections, 16 theorems, 105 equations, 2 figures, 3 tables)

Introduction
List of contributions
Other relevant literature
Conditional deep generative models for distribution regression
Convergence rates of the Sieve MLE
Neural network class
Wasserstein convergence of the intrinsic (conditional) distributions
Characterization of the learnable distribution class
Smooth conditional density
A broader conditional distribution class with smoothness disparity
Conditional distribution on manifolds
Numerical Results
Discussion
Additional numerical results
Numerical result for real data
...and 14 more sections

Key Result

Lemma 1

Let $\mathcal{F}$ be a class of functions from ${\mathcal{Z}} \times {\mathcal{X}}$ to $\mathbb{R}^D$ such that $\||g|_\infty\|_\infty \le K$ for every $g \in \mathcal{F}$. Let ${\mathcal{P}} = \left\{ P_{g,\sigma}: g\in \mathcal{F}, \sigma \in [\sigma_{\min}, \sigma_{\max}] \right\}$ with $\sigma_{\m

Figures (2)

Figure 1: Generated samples from manifolds $M_1$ and $M_2$ are displayed in the left panel. The right panel shows box plots of the empirical Wasserstein distance at different noise levels $\sigma_*$.
Figure 2: MNIST images: real images (left panel), images generated with the sparse architecture (central panel), and images generated with the fully connected architecture (right panel).

Theorems & Definitions (29)

Remark 1: Strength of the Composite structure
Example 1: One‐dimensional $\beta$-Hölder Generator
Lemma 1
Theorem 1
Corollary 1
Remark 2
Theorem 2
Corollary 2
Lemma 2
Theorem 3
...and 19 more
