Tukey g-and-h neural network regression for non-Gaussian data

Arthur P. Guillaumin, Natalia Efremova

TL;DR

This work considers the training of a neural network to predict the parameters of a Tukey g-and-h distribution in a regression framework via the minimization of the corresponding negative log-likelihood, despite the latter having no closed-form expression.

Abstract

This paper addresses non-Gaussian regression with neural networks via the use of the Tukey g-and-h distribution. The Tukey g-and-h transform is a flexible parametric transform with two parameters $g$ and $h$ which, when applied to a standard normal random variable, introduces both skewness and kurtosis, resulting in a distribution commonly called the Tukey g-and-h distribution. Specific values of $g$ and $h$ produce good approximations to other families of distributions, such as the Cauchy and Student's t-distributions. The flexibility of the Tukey g-and-h distribution has driven its popularity in the statistical community, in applied sciences and finance. In this work we consider the training of a neural network to predict the parameters of a Tukey g-and-h distribution in a regression framework via the minimization of the corresponding negative log-likelihood, despite the latter having no closed-form expression. We demonstrate the efficiency of our procedure in simulated examples and apply our method to a real-world dataset of global crop yield for several types of crops. Finally, we show how we can carry out a goodness-of-fit analysis between the predicted distributions and the test data. A PyTorch implementation is made available on GitHub and as a PyPI package.
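To make the transform and the likelihood concrete, here is a minimal NumPy sketch, not the authors' PyTorch implementation. It assumes the standard parameterization $\tau_{g,h}(z) = \frac{e^{gz}-1}{g}\, e^{hz^2/2}$ (with $\tau_{0,h}(z) = z\, e^{hz^2/2}$), a location-scale form $Y = \text{loc} + \text{scale}\cdot\tau_{g,h}(Z)$, and $h \ge 0$ so that the transform is increasing. Since the inverse transform has no closed form, the sketch inverts it by bisection, which is an assumption made here for illustration; the paper may use a different scheme. All function names are hypothetical.

```python
import numpy as np

def _base(z, g):
    """(exp(g*z) - 1) / g, with the g -> 0 limit equal to z."""
    return z if abs(g) < 1e-8 else np.expm1(g * z) / g

def tukey_gh(z, g, h):
    """Tukey g-and-h transform of a standard normal value z (assumes h >= 0)."""
    z = np.asarray(z, dtype=float)
    return _base(z, g) * np.exp(0.5 * h * z**2)

def tukey_gh_derivative(z, g, h):
    """Derivative of the transform with respect to z (positive when h >= 0)."""
    z = np.asarray(z, dtype=float)
    return np.exp(0.5 * h * z**2) * (np.exp(g * z) + h * z * _base(z, g))

def inverse_tukey_gh(t, g, h, lo=-10.0, hi=10.0, n_iter=80):
    """Invert the transform by bisection; valid for targets inside [tau(lo), tau(hi)]."""
    t = np.asarray(t, dtype=float)
    lo = np.full_like(t, lo)
    hi = np.full_like(t, hi)
    for _ in range(n_iter):
        mid = 0.5 * (lo + hi)
        too_small = tukey_gh(mid, g, h) < t
        lo = np.where(too_small, mid, lo)
        hi = np.where(too_small, hi, mid)
    return 0.5 * (lo + hi)

def tukey_gh_nll(y, g, h, loc=0.0, scale=1.0):
    """Pointwise negative log-likelihood via the change-of-variables formula."""
    z = inverse_tukey_gh((np.asarray(y, dtype=float) - loc) / scale, g, h)
    log_phi = -0.5 * z**2 - 0.5 * np.log(2.0 * np.pi)
    log_jacobian = np.log(scale) + np.log(tukey_gh_derivative(z, g, h))
    return -(log_phi - log_jacobian)

# Draw skewed, heavy-tailed samples and evaluate their average NLL.
rng = np.random.default_rng(0)
samples = 1.0 + 2.0 * tukey_gh(rng.standard_normal(1000), g=0.5, h=0.1)
mean_nll = tukey_gh_nll(samples, g=0.5, h=0.1, loc=1.0, scale=2.0).mean()
```

In a regression setting, `g`, `h`, `loc` and `scale` would be outputs of the network for each input, and the mean of `tukey_gh_nll` over a batch would serve as the training loss. Setting `g = h = 0` recovers the Gaussian negative log-likelihood.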

Paper Structure

This paper contains 15 sections, 17 equations, 13 figures.

Figures (13)

  • Figure 1: Comparison of the probability density function of a Tukey g-and-h distributed random variable with that of a standard normal random variable, for different values of $g$ and $h$.
  • Figure 2: True regression functions (solid) and trained regression functions (dashed) on 10000 observations.
  • Figure 4: Comparison of true (blue) and fitted (Tukey g-and-h in orange vs Gaussian in green) probability density functions at four values of the scalar feature $x$, in the case where the target variable follows a t-distribution.
  • Figure 5: Tukey g-and-h (orange) and Gaussian (green) losses over training epochs for training (solid) and validation (dashed) data for a t-distributed target variable.
  • Figure 6: Global yield of maize in tons per hectare in 2010 on a 0.5' spatial-resolution grid.
  • ...and 8 more figures