Binary AddiVortes: (Bayesian) Additive Voronoi Tessellations for Binary Classification with an application to Predicting Home Mortgage Application Outcomes

Adam J. Stone, Emmanuel Ogundimu, John Paul Gosling

TL;DR

This work extends the AddiVortes framework to binary classification by embedding a probit latent-variable model within a sum-of-tessellations design, enabling probabilistic predictions and uncertainty quantification for binary outcomes. Through data augmentation and Bayesian backfitting, the method captures complex, local covariate interactions via multiple Voronoi tessellations while applying regularization to prevent overfitting. Empirical results on benchmark binary datasets and a mortgage-approval application show AddiVortes frequently achieves superior AUC and competitive accuracy relative to RF, BART, and XGBoost, with notable interpretability through variable inclusion and posterior intervals. The mortgage analysis demonstrates practical impact for financial decision-making, combining strong predictive performance with transparent, region-specific influence of covariates, and the approach is positioned for extensions to multinomial and time-to-event contexts.

Abstract

The Additive Voronoi Tessellations (AddiVortes) model is a multivariate regression model that uses multiple Voronoi tessellations to partition the covariate space for an additive ensemble model. In this paper, the AddiVortes framework is extended to binary classification by incorporating a probit model with a latent variable formulation. Specifically, we utilise a data augmentation technique, where a latent variable is introduced and the binary response is determined via thresholding. In most cases, the AddiVortes model outperforms random forests, BART and other leading black-box regression models when compared using a range of metrics. A comprehensive analysis is conducted using AddiVortes to predict an individual's likelihood of being approved for a home mortgage, based on a range of covariates. This evaluation highlights the model's effectiveness in capturing complex relationships within the data and its potential for improving decision-making in mortgage approval processes.
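
The probit construction in the abstract can be illustrated with a short sketch. It is not the authors' implementation: it assumes the standard probit data-augmentation setup (a latent variable with unit variance, thresholded at zero), plain Euclidean distance over all covariates when assigning a point to a Voronoi cell, and fixed tessellations. The names `nearest_cell`, `latent_mean`, `predict_probability` and `sample_latent`, and the toy centres and output values, are hypothetical. In a full sampler the latent draws would alternate with Bayesian backfitting updates of the tessellations, which is omitted here.

```python
import math
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal: provides Phi (cdf) and its inverse


def nearest_cell(x, centres):
    """Index of the Voronoi cell containing x, i.e. the closest centre.

    Assumption: Euclidean distance over all covariates.
    """
    return min(range(len(centres)), key=lambda k: math.dist(x, centres[k]))


def latent_mean(x, tessellations):
    """Sum, over tessellations, of the output value of the cell containing x."""
    return sum(mus[nearest_cell(x, centres)] for centres, mus in tessellations)


def predict_probability(x, tessellations):
    """Probit link: P(y = 1 | x) = Phi(sum of cell output values)."""
    return STD_NORMAL.cdf(latent_mean(x, tessellations))


def sample_latent(y, mean, rng):
    """Draw z ~ N(mean, 1) truncated to z > 0 if y == 1, otherwise to z <= 0.

    Uses inverse-CDF sampling, which is simple but numerically fragile
    when |mean| is large (roughly above 7); this is only a sketch.
    """
    p0 = STD_NORMAL.cdf(-mean)  # probability that z <= 0 before truncation
    lo, hi = (p0, 1.0) if y == 1 else (0.0, p0)
    u = rng.uniform(lo, hi)
    u = min(max(u, 1e-12), 1.0 - 1e-12)  # keep inv_cdf strictly inside (0, 1)
    return mean + STD_NORMAL.inv_cdf(u)


# Toy example (hypothetical values): two tessellations of the unit square,
# each given as (list of centres, list of cell output values).
tessellations = [
    ([(0.0, 0.0), (1.0, 1.0)], [-0.5, 0.8]),
    ([(0.0, 1.0), (1.0, 0.0)], [0.3, -0.2]),
]
x = (0.9, 0.8)
rng = random.Random(0)

m = latent_mean(x, tessellations)          # contribution of both tessellations
p = predict_probability(x, tessellations)  # predicted P(y = 1 | x)
z_pos = sample_latent(1, m, rng)           # latent draw consistent with y = 1
z_neg = sample_latent(0, m, rng)           # latent draw consistent with y = 0
print(m, p, z_pos, z_neg)
```

Thresholding the latent draw at zero recovers the observed label, so conditional on the latent values the tessellations can be updated as in the regression case.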

Paper Structure

This paper contains 7 sections, 16 equations, 8 figures, 5 tables.

Figures (8)

  • Figure 1: Example of predictive modelling using a two-dimensional Voronoi tessellation with centres at crosses, each labelled with the output value $(\mu_1,\ldots,\mu_8)$ associated with its cell. Samples are represented by points, and each takes the output value of the cell it lies in.
  • Figure 2: The rotated axis function with $\theta = \pi/6$ (left); the predicted probability of the rotated axis function for AddiVortes (middle) and BART (right).
  • Figure 3: The sinusoid function with $\alpha=0.5$ (left); the predicted probability of sinusoid for AddiVortes (middle) and BART (right).
  • Figure 4: The accuracy of prediction for AddiVortes and competing methods for the rotated axis function for varying $\theta$ (left) and the sinusoid function for varying $\alpha$ (right).
  • Figure 5: A boxplot of the accuracy values for each competing method on the home mortgage dataset.
  • ...and 3 more figures