Fully Heteroscedastic Count Regression with Deep Double Poisson Networks

Spencer Young; Porter Jenkins; Longchao Da; Jeff Dotson; Hua Wei

TL;DR

The paper addresses count regression under input-dependent uncertainty by introducing the Deep Double Poisson Network (DDPN), which outputs the Double Poisson parameters $\mu$ and $\gamma$ to produce fully heteroscedastic predictive distributions over nonnegative integers. DDPN exhibits learnable loss attenuation, and a discrete $\beta$-NLL variant lets practitioners control the attenuation strength and training dynamics while keeping the mean fit robust. The authors prove that DDPN satisfies full heteroscedasticity under moment approximations and report state-of-the-art accuracy, calibration (CRPS), and out-of-distribution detection across diverse real-world datasets, with ensembling improving epistemic uncertainty estimation. The work improves the reliability of probabilistic predictions for count data, which supports decision-making in high-stakes domains, and gives practical guidance on tuning training dynamics via the $\beta$ parameter.

Abstract

Neural networks capable of accurate, input-conditional uncertainty representation are essential for real-world AI systems. Deep ensembles of Gaussian networks have proven highly effective for continuous regression due to their ability to flexibly represent aleatoric uncertainty via unrestricted heteroscedastic variance, which in turn enables accurate epistemic uncertainty estimation. However, no analogous approach exists for count regression, despite many important applications. To address this gap, we propose the Deep Double Poisson Network (DDPN), a novel neural discrete count regression model that outputs the parameters of the Double Poisson distribution, enabling arbitrarily high or low predictive aleatoric uncertainty for count data and improving epistemic uncertainty estimation when ensembled. We formalize and prove that DDPN exhibits robust regression properties similar to heteroscedastic Gaussian models via learnable loss attenuation, and introduce a simple loss modification to control this behavior. Experiments on diverse datasets demonstrate that DDPN outperforms current baselines in accuracy, calibration, and out-of-distribution detection, establishing a new state-of-the-art in deep count regression.
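
To make the training objective described above more concrete, the following is a minimal sketch of a per-example Double Poisson negative log-likelihood. It assumes Efron's (1986) unnormalized density with $\gamma$ acting as an inverse-dispersion parameter (mean approximately $\mu$, variance approximately $\mu / \gamma$) and drops terms that depend only on $y$. The function names and the $\gamma^{-\beta}$ weighting are illustrative assumptions modeled on the general $\beta$-NLL idea, not the authors' code.

```python
import math


def double_poisson_nll(y: int, mu: float, gamma: float) -> float:
    """Approximate NLL of count y under a Double Poisson with parameters (mu, gamma).

    Assumption: Efron's unnormalized density, with terms that do not depend
    on mu or gamma dropped. Requires mu > 0 and gamma > 0.
    """
    # y * log(y) is taken to be 0 at y = 0 (its limiting value).
    y_log_y = y * math.log(y) if y > 0 else 0.0
    return (
        -0.5 * math.log(gamma)
        + gamma * mu
        - gamma * (y + y * math.log(mu) - y_log_y)
    )


def beta_weighted_nll(y: int, mu: float, gamma: float, beta: float) -> float:
    """Hypothetical beta-weighted loss: scale the NLL by gamma ** (-beta).

    Assumption: in an autodiff framework the weight would be treated as a
    constant (stop-gradient); here it is just a plain number.
    beta = 0 recovers the unweighted NLL.
    """
    weight = gamma ** (-beta)
    return weight * double_poisson_nll(y, mu, gamma)


# A target far from the predicted mean is penalized less when gamma is small
# (high predicted variance): this is the loss attenuation effect.
far_confident = double_poisson_nll(20, 5.0, 1.0)
far_uncertain = double_poisson_nll(20, 5.0, 0.2)
```

Under these assumptions, lowering $\gamma$ on a poorly fit point shrinks the residual term at the cost of the $-\tfrac{1}{2}\log\gamma$ penalty, which is the trade-off the abstract refers to as learnable loss attenuation.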

Paper Structure (64 sections, 5 theorems, 28 equations, 32 figures, 5 tables)

Introduction
Our Contributions
Modeling Predictive Uncertainty with Neural Networks
Epistemic Uncertainty
Aleatoric Uncertainty
Heteroscedastic Regression in Deep Learning
Heteroscedastic Regression with Generalized Linear Models
Full Heteroscedasticity
Deep Double Poisson Networks (DDPN)
DDPN Objective
Loss Attenuation Dynamics of DDPN
$\beta$-DDPN: Controllable Loss Attenuation
DDPN Ensembles
Experiments
Evaluation Metrics
...and 49 more sections

Key Result

Proposition 2.3

Gaussian regressors are fully heteroscedastic, whereas Poisson and Negative Binomial regressors are not.
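
The intuition behind this proposition can be illustrated with the standard variance formulas of each family. The sketch below is an illustration only, not the paper's proof: it assumes the common mean/dispersion parameterization of the Negative Binomial (variance $\mu + \alpha\mu^2$ with $\alpha > 0$) and the approximate Double Poisson variance $\mu / \gamma$. All function names are hypothetical.

```python
def poisson_variance(mu: float) -> float:
    """Poisson: the variance is pinned to the mean."""
    return mu


def negative_binomial_variance(mu: float, alpha: float) -> float:
    """Negative Binomial (mean/dispersion form): variance is always above the mean."""
    return mu + alpha * mu ** 2


def gaussian_variance(mu: float, sigma2: float) -> float:
    """Gaussian: the variance is a free parameter, unrelated to the mean."""
    return sigma2


def double_poisson_variance(mu: float, gamma: float) -> float:
    """Double Poisson (approximate moments): variance mu / gamma, above or below the mean."""
    return mu / gamma


mu = 4.0
# Poisson cannot change its variance at a fixed mean.
v_poisson = poisson_variance(mu)
# Negative Binomial can only express over-dispersion.
v_nb = negative_binomial_variance(mu, alpha=0.5)
# Double Poisson reaches under-dispersion (gamma > 1) and over-dispersion (gamma < 1).
v_dp_under = double_poisson_variance(mu, gamma=8.0)
v_dp_over = double_poisson_variance(mu, gamma=0.25)
```

At a fixed mean, only the Gaussian and (approximately) the Double Poisson can place the variance on either side of the mean, which is the property the paper calls full heteroscedasticity.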

Figures (32)

Figure 1: Simulation experiment demonstrating a heteroscedastic data-generating process with discrete outputs. The predictive distributions of three ensemble methods are compared. The mean predictions are shown in black, the 95% credible intervals shaded in blue, along with the corresponding Mean Absolute Error (MAE) and Continuous Ranked Probability Score (CRPS) in the top left (Top). Uncertainty is decomposed into its aleatoric (Middle) and epistemic (Bottom) components, with the ground-truth aleatoric uncertainty represented by a dashed line. Existing discrete regression methods are unable to accurately capture aleatoric uncertainty, which impacts both epistemic and total predictive uncertainty. In contrast, ensembles comprising Deep Double Poisson Networks (DDPNs) effectively represent epistemic and aleatoric uncertainty, while also improving mean fit.
Figure 2: An overview of the Deep Double Poisson Network (DDPN). DDPN is a neural network that can process complex data and outputs the parameters of a Double Poisson distribution, $\hat{\mu}_i$ and $\hat{\gamma}_i$. The resulting predictive distributions exhibit unrestricted variance such that the network can learn over-, equi-, and under-dispersion. We ensemble a set of $M$ DDPNs to estimate aleatoric and epistemic uncertainty.
Figure 3: Even when explicitly misspecified, DDPN recovers the data-generating distribution.
Figure 4: Predictive PMFs from a $\beta_{1.0}$-DDPN ensemble on samples from the test split of COCO-People. Individual member predictions are in gray, while the ensemble prediction is in blue. DDPN is able to flexibly and accurately represent counts of various magnitudes.
Figure 5: Predictive CDFs from a $\beta_{0.5}$-Gaussian (blue) and $\beta_{0.5}$-DDPN (green) model on two test points from Length of Stay. Observed labels are marked with a vertical dashed line. When $y$ is discrete, Gaussian models' predictions suffer from having to assign probability mass to infeasible continuous values. DDPN is free from this constraint.
...and 27 more figures
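
The aleatoric/epistemic split mentioned in the captions of Figures 1 and 2 can be sketched with the law of total variance, which is the usual decomposition for deep ensembles. This sketch assumes the ensemble is a uniform mixture of its $M$ members and that each member's variance is approximated by $\hat{\mu} / \hat{\gamma}$. The function name and example values are hypothetical, and the paper's exact procedure may differ.

```python
from statistics import fmean, pvariance


def decompose_uncertainty(mus: list[float], gammas: list[float]) -> tuple[float, float, float]:
    """Split the predictive variance of a uniform ensemble into two parts.

    aleatoric = average of the members' variances (approximated as mu / gamma)
    epistemic = population variance of the members' means
    total     = aleatoric + epistemic (law of total variance)
    """
    member_variances = [mu / gamma for mu, gamma in zip(mus, gammas)]
    aleatoric = fmean(member_variances)
    epistemic = pvariance(mus)
    return aleatoric, epistemic, aleatoric + epistemic


# Two hypothetical ensemble members predicting for the same input.
aleatoric, epistemic, total = decompose_uncertainty(mus=[4.0, 6.0], gammas=[2.0, 1.5])
```

Disagreement between the members' means drives the epistemic term, while the members' own predicted variances drive the aleatoric term.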

Theorems & Definitions (15)

Definition 2.1
Definition 2.2
Proposition 2.3
Proposition 3.1
Definition 3.2
Proposition 3.3
Proposition 3.4
Definition 1.1
Proposition 1.2
proof
...and 5 more
