Fully Heteroscedastic Count Regression with Deep Double Poisson Networks

Spencer Young, Porter Jenkins, Longchao Da, Jeff Dotson, Hua Wei

TL;DR

The paper addresses count regression under input-dependent uncertainty by introducing the Deep Double Poisson Network (DDPN), which outputs the Double Poisson parameters $\mu$ and $\gamma$ to produce fully heteroscedastic predictive distributions over nonnegative integers. DDPN exhibits learnable loss attenuation, and a discrete $\beta$-NLL variant lets practitioners control the attenuation strength, and with it the training dynamics, while keeping the mean fit robust. The authors prove that DDPN is fully heteroscedastic under moment approximations and report state-of-the-art accuracy, calibration (CRPS), and out-of-distribution detection across diverse real-world datasets, with ensembles of DDPNs improving epistemic uncertainty estimation. The work improves the reliability of probabilistic predictions for count data, which matters for decision-making in high-stakes domains, and offers practical guidance on tuning training dynamics via the $\beta$ parameter.

Abstract

Neural networks capable of accurate, input-conditional uncertainty representation are essential for real-world AI systems. Deep ensembles of Gaussian networks have proven highly effective for continuous regression due to their ability to flexibly represent aleatoric uncertainty via unrestricted heteroscedastic variance, which in turn enables accurate epistemic uncertainty estimation. However, no analogous approach exists for count regression, despite many important applications. To address this gap, we propose the Deep Double Poisson Network (DDPN), a novel neural discrete count regression model that outputs the parameters of the Double Poisson distribution, enabling arbitrarily high or low predictive aleatoric uncertainty for count data and improving epistemic uncertainty estimation when ensembled. We formalize and prove that DDPN exhibits robust regression properties similar to heteroscedastic Gaussian models via learnable loss attenuation, and introduce a simple loss modification to control this behavior. Experiments on diverse datasets demonstrate that DDPN outperforms current baselines in accuracy, calibration, and out-of-distribution detection, establishing a new state-of-the-art in deep count regression.
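
To make the loss described above concrete, the following is a minimal sketch of a per-example Double Poisson negative log-likelihood with an optional $\beta$-style reweighting. It assumes Efron's Double Poisson density with mean parameter $\mu$ and inverse-dispersion parameter $\gamma$, drops the terms that do not depend on $\mu$ or $\gamma$, and neglects the normalizing constant. The function name `double_poisson_nll` and the exact form of the $\gamma^{-\beta}$ weight are assumptions of this sketch, not code taken from the paper.

```python
import math


def double_poisson_nll(y, mu, gamma, beta=0.0):
    """Double Poisson NLL for one example, up to terms constant in (mu, gamma).

    Assumes Efron's parameterization (mean ~ mu, variance ~ mu / gamma) and
    neglects the normalizing constant. With beta > 0 the term is rescaled by
    gamma ** (-beta); in an autodiff framework that weight would be treated
    as a constant (stop-gradient). Here only the loss value is computed.
    """
    if mu <= 0 or gamma <= 0:
        raise ValueError("mu and gamma must be strictly positive")
    if y < 0:
        raise ValueError("y must be a nonnegative count")

    # Convention: y * log(y) -> 0 as y -> 0.
    y_log_y = y * math.log(y) if y > 0 else 0.0

    # -1/2 log(gamma) + gamma * mu - gamma * y * (1 + log(mu) - log(y))
    nll = (
        -0.5 * math.log(gamma)
        + gamma * mu
        - gamma * (y + y * math.log(mu) - y_log_y)
    )
    return gamma ** (-beta) * nll


if __name__ == "__main__":
    # For fixed gamma the loss is smallest when mu equals the observed count.
    print(double_poisson_nll(3, 3.0, 2.0))
    print(double_poisson_nll(3, 4.0, 2.0))
```

In this form, $\gamma$ multiplies the mean-dependent part of the loss, so a small predicted $\gamma$ (high predicted variance) shrinks the penalty for a poorly fit count. This is the loss-attenuation behavior the abstract refers to, and the $\gamma^{-\beta}$ weight is one way to dial it back.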

Paper Structure

This paper contains 64 sections, 5 theorems, 28 equations, 32 figures, 5 tables.

Key Result

Proposition 2.3

Gaussian regressors are fully heteroscedastic, whereas Poisson and Negative Binomial regressors are not.
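
The intuition behind this result can be read off the variance functions of each family. The summary below uses standard parameterizations (a Negative Binomial with mean $\mu$ and dispersion $r > 0$, and Efron's approximate moments for the Double Poisson). These are assumptions of this note, and the paper's formal Definitions 2.1 and 2.2 are not reproduced here.

$$
\begin{aligned}
\text{Gaussian:} \quad & \mathrm{Var}[y] = \sigma^2 && \text{(free of the mean; any positive value)} \\
\text{Poisson:} \quad & \mathrm{Var}[y] = \mu && \text{(tied to the mean; equi-dispersion only)} \\
\text{Negative Binomial:} \quad & \mathrm{Var}[y] = \mu + \frac{\mu^2}{r} > \mu && \text{(over-dispersion only)} \\
\text{Double Poisson:} \quad & \mathrm{Var}[y] \approx \frac{\mu}{\gamma} && \text{(over-, equi-, or under-dispersion)}
\end{aligned}
$$

Under these forms, a Poisson or Negative Binomial regressor cannot predict a variance below its mean, whereas a Double Poisson regressor can, by choosing $\gamma > 1$.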

Figures (32)

  • Figure 1: Simulation experiment demonstrating a heteroscedastic data-generating process with discrete outputs. The predictive distributions of three ensemble methods are compared. The mean predictions are shown in black, the 95% credible intervals shaded in blue, along with the corresponding Mean Absolute Error (MAE) and Continuous Ranked Probability Score (CRPS) in the top left (Top). Uncertainty is decomposed into its aleatoric (Middle) and epistemic (Bottom) components, with the ground-truth aleatoric uncertainty represented by a dashed line. Existing discrete regression methods are unable to accurately capture aleatoric uncertainty, which impacts both epistemic and total predictive uncertainty. In contrast, ensembles comprising Deep Double Poisson Networks (DDPNs) effectively represent epistemic and aleatoric uncertainty, while also improving mean fit.
  • Figure 2: An overview of the Deep Double Poisson Network (DDPN). DDPN is a neural network that can process complex data and outputs the parameters of a Double Poisson distribution, $\hat{\mu}_i$ and $\hat{\gamma}_i$. The resulting predictive distributions exhibit unrestricted variance such that the network can learn over-, equi-, and under-dispersion. We ensemble a set of $M$ DDPNs to estimate aleatoric and epistemic uncertainty.
  • Figure 3: Even when explicitly misspecified, DDPN recovers the data-generating distribution.
  • Figure 4: Predictive PMFs from a $\beta_{1.0}$-DDPN ensemble on samples from the test split of COCO-People. Individual member predictions are in gray, while the ensemble prediction is in blue. DDPN is able to flexibly and accurately represent counts of various magnitudes.
  • Figure 5: Predictive CDFs from a $\beta_{0.5}$-Gaussian (blue) and $\beta_{0.5}$-DDPN (green) model on two test points from Length of Stay. Observed labels are marked with a vertical dashed line. When $y$ is discrete, Gaussian models' predictions suffer from having to assign probability mass to infeasible continuous values. DDPN is free from this constraint.
  • ...and 27 more figures
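
The aleatoric and epistemic decomposition mentioned in the captions for Figures 1 and 2 can be sketched with the law of total variance, as is common for deep ensembles. The example below assumes each ensemble member's predictive variance is approximated by $\hat{\mu}/\hat{\gamma}$ and that members are weighted equally. The function name `decompose_uncertainty` is hypothetical, and the paper's exact decomposition may differ.

```python
from statistics import fmean, pvariance


def decompose_uncertainty(mus, gammas):
    """Split ensemble predictive variance for a single input.

    mus, gammas: per-member Double Poisson parameters (equal-length lists).
    Assumes each member's variance is approximately mu / gamma.
    Returns (aleatoric, epistemic, total).
    """
    if len(mus) != len(gammas) or not mus:
        raise ValueError("mus and gammas must be non-empty and equal length")

    # Aleatoric: average of the members' own predictive variances.
    aleatoric = fmean([m / g for m, g in zip(mus, gammas)])
    # Epistemic: disagreement between member means (population variance).
    epistemic = pvariance(mus)
    return aleatoric, epistemic, aleatoric + epistemic


if __name__ == "__main__":
    print(decompose_uncertainty([4.0, 6.0], [2.0, 2.0]))
```

When all members agree on the mean, the epistemic term is zero and the total variance reduces to the average member variance.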

Theorems & Definitions (15)

  • Definition 2.1
  • Definition 2.2
  • Proposition 2.3
  • Proposition 3.1
  • Definition 3.2
  • Proposition 3.3
  • Proposition 3.4
  • Definition 1.1
  • Proposition 1.2
  • proof
  • ...and 5 more