Mixture Density Networks for Classification with an Application to Product Bundling
Narendhar Gugulothu, Sanjay P. Bhat, Tejas Bodas
TL;DR
This work introduces two MDN-based classifiers (MDN-C1 and MDN-C2) that learn Gaussian mixtures and classify samples by evaluating the learned CDFs, addressing the limited use of MDNs for classification. The models are trained end-to-end with a cross-entropy objective and, in one variant, an L1 penalty to promote sparsity among mixture components. On three benchmark datasets the models achieve competitive performance. The authors also demonstrate a product bundling application: learning product-level willingness-to-pay (WTP) distributions from sales data and computing the bundle WTP distribution via convolution to support revenue optimization. The study highlights the practical utility of MDNs for distribution-aware classification and shows how learned distributions can drive decision-making in bundling scenarios. It suggests future extensions to alternative mixture families and richer data sources.
Abstract
While mixture density networks (MDNs) have been used extensively for regression tasks, they have seen little use in classification. One reason is that it is not obvious how to apply MDNs to classification. In this paper, we propose two MDN-based models for classification tasks. Both models fit mixtures of Gaussians to the data and use the fitted distributions to classify a given sample by evaluating the learnt cumulative distribution function for the given input features. While the proposed MDN-based models perform slightly better than, or on par with, five baseline classification models on three publicly available datasets, the real utility of our models emerges in a real-world product bundling application. Specifically, we use our MDN-based models to learn the willingness-to-pay (WTP) distributions for two products from synthetic sales data of the individual products. The Gaussian mixture representation of the learnt WTP distributions is then exploited to obtain the WTP distribution of the bundle consisting of both products. The proposed MDN-based models approximate the true WTP distributions of both products and of the bundle well.
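The bundling step described above can be illustrated with a minimal sketch. It is not the authors' implementation, and it rests on assumptions not stated in the abstract: the two products' WTPs are treated as independent, so that the bundle WTP is their sum and its density is the convolution of the two mixtures. A customer is assumed to buy when their WTP is at least the price, so the purchase probability is one minus the mixture CDF at that price. The function names and mixture parameters are hypothetical and chosen only for illustration.

```python
import math
from itertools import product

# A mixture is a list of (weight, mean, std) tuples; all values below are made up.

def normal_cdf(x, mu, sigma):
    """CDF of a Gaussian N(mu, sigma^2) evaluated at x."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

def mixture_cdf(x, mixture):
    """CDF of a Gaussian mixture: the weighted sum of component CDFs."""
    return sum(w * normal_cdf(x, mu, sigma) for w, mu, sigma in mixture)

def purchase_probability(price, mixture):
    """Assumed decision rule: a customer buys if WTP >= price."""
    return 1.0 - mixture_cdf(price, mixture)

def convolve_mixtures(mix_a, mix_b):
    """Distribution of A + B for independent Gaussian-mixture A and B.

    Each pair of components yields one Gaussian: weights multiply,
    means add, and variances add.
    """
    return [
        (wa * wb, ma + mb, math.hypot(sa, sb))
        for (wa, ma, sa), (wb, mb, sb) in product(mix_a, mix_b)
    ]

# Hypothetical WTP mixtures for two products.
wtp_a = [(0.6, 10.0, 2.0), (0.4, 20.0, 3.0)]
wtp_b = [(0.5, 5.0, 1.0), (0.5, 8.0, 1.5)]

wtp_bundle = convolve_mixtures(wtp_a, wtp_b)
bundle_buy_prob = purchase_probability(22.0, wtp_bundle)
```

Because the convolution of two Gaussian mixtures is again a Gaussian mixture, the bundle distribution is available in closed form with one component per pair of product-level components, and no numerical integration is needed.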
