Discriminative classification with generative features: bridging Naive Bayes and logistic regression

Zachary Terner, Alexander Petersen, Yuedong Wang

TL;DR

The paper addresses how to improve discriminative classification by using generative information. It introduces Smart Bayes, a framework that uses marginal log-density-ratio features as inputs to a logistic-regression model, blending Naive Bayes-style density-ratio cues with discriminative learning. A spline-based univariate density-ratio estimator makes the feature construction practical and flexible, and the resulting approach often outperforms both Naive Bayes and traditional logistic regression in simulations and on real-world datasets. The work highlights the value of hybrid generative-discriminative strategies and opens avenues for extending density-ratio features to other learners and applications.
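
The relationship between the two models can be written out explicitly. The notation below is illustrative and not taken from the paper: $f_{1j}$ and $f_{0j}$ denote the class-conditional marginal densities of feature $j$, and $\pi_1$, $\pi_0$ the class priors. Under the Naive Bayes independence assumption, the log posterior odds are

$$\log\frac{P(Y=1\mid \mathbf{x})}{P(Y=0\mid \mathbf{x})} = \log\frac{\pi_1}{\pi_0} + \sum_{j=1}^{p} \log\frac{f_{1j}(x_j)}{f_{0j}(x_j)},$$

which is a logistic model in the log-density-ratio features with every slope fixed at 1. Smart Bayes, as described here, keeps the same features but estimates the intercept and slopes from data:

$$\log\frac{P(Y=1\mid \mathbf{x})}{P(Y=0\mid \mathbf{x})} = \beta_0 + \sum_{j=1}^{p} \beta_j \log\frac{f_{1j}(x_j)}{f_{0j}(x_j)}.$$

Setting $\beta_0 = \log(\pi_1/\pi_0)$ and $\beta_j = 1$ for all $j$ recovers Naive Bayes as a special case.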

Abstract

We introduce Smart Bayes, a new classification framework that bridges generative and discriminative modeling by integrating likelihood-ratio-based generative features into a logistic-regression-style discriminative classifier. From the generative perspective, Smart Bayes relaxes the fixed unit weights of Naive Bayes by allowing data-driven coefficients on density-ratio features. From a discriminative perspective, it constructs transformed inputs as marginal log-density ratios that explicitly quantify how much more likely each feature value is under one class than another, thereby providing predictors with stronger class separation than the raw covariates. To support this framework, we develop a spline-based estimator for univariate log-density ratios that is flexible, robust, and computationally efficient. Through extensive simulations and real-data studies, Smart Bayes often outperforms both logistic regression and Naive Bayes. Our results highlight the potential of hybrid approaches that exploit generative structure to enhance discriminative performance.
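
The idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and it rests on several assumptions: the marginal log-density ratios are estimated with per-class Gaussian fits rather than the paper's spline-based estimator, the logistic regression is fitted by a plain Newton iteration with a small ridge penalty, the data are simulated, and every function name (`gaussian_log_density_ratio`, `fit_logistic`, `predict`, `simulate`) is hypothetical.

```python
import numpy as np

def gaussian_log_density_ratio(X_train, y_train):
    """Return a function mapping X to marginal log-density-ratio features.

    Assumption: each class-conditional marginal is Gaussian. This is a
    stand-in for the paper's spline-based estimator, not a reproduction.
    """
    X1, X0 = X_train[y_train == 1], X_train[y_train == 0]
    mu1, sd1 = X1.mean(axis=0), X1.std(axis=0, ddof=1)
    mu0, sd0 = X0.mean(axis=0), X0.std(axis=0, ddof=1)

    def transform(X):
        log_f1 = -np.log(sd1) - 0.5 * ((X - mu1) / sd1) ** 2
        log_f0 = -np.log(sd0) - 0.5 * ((X - mu0) / sd0) ** 2
        return log_f1 - log_f0  # the shared -0.5*log(2*pi) term cancels

    return transform

def fit_logistic(Z, y, ridge=1e-3, n_iter=50):
    """Logistic regression via Newton's method with a small ridge penalty."""
    A = np.column_stack([np.ones(len(Z)), Z])
    beta = np.zeros(A.shape[1])
    penalty = ridge * np.eye(A.shape[1])
    penalty[0, 0] = 0.0  # leave the intercept unpenalized
    for _ in range(n_iter):
        eta = np.clip(A @ beta, -30, 30)  # guard against overflow in exp
        p = 1.0 / (1.0 + np.exp(-eta))
        grad = A.T @ (y - p) - penalty @ beta
        hess = (A * (p * (1 - p))[:, None]).T @ A + penalty
        beta = beta + np.linalg.solve(hess, grad)
    return beta

def predict(beta, Z):
    """Classify as 1 when the fitted log-odds are positive."""
    return (beta[0] + Z @ beta[1:] > 0).astype(int)

rng = np.random.default_rng(0)

def simulate(n):
    """Two classes with correlated features (a toy setting, not the paper's)."""
    y = rng.integers(0, 2, size=n)
    shared = rng.normal(size=(n, 1))  # induces correlation across features
    X = rng.normal(size=(n, 3)) + shared + 2.0 * y[:, None]
    return X, y

X_tr, y_tr = simulate(400)
X_te, y_te = simulate(400)

to_ratio = gaussian_log_density_ratio(X_tr, y_tr)

# Smart Bayes-style fit: learned intercept and slopes on the ratio features.
beta = fit_logistic(to_ratio(X_tr), y_tr)

# Naive Bayes: same features, intercept = log prior odds, all slopes = 1.
prior_odds = y_tr.mean() / (1 - y_tr.mean())
nb_beta = np.concatenate([[np.log(prior_odds)], np.ones(X_tr.shape[1])])

smart_acc = np.mean(predict(beta, to_ratio(X_te)) == y_te)
nb_acc = np.mean(predict(nb_beta, to_ratio(X_te)) == y_te)
print("learned coefficients:", np.round(beta, 2))
print("accuracy (learned weights):", smart_acc)
print("accuracy (unit weights):  ", nb_acc)
```

Both classifiers share the same transformed inputs; the only difference is whether the coefficients are fixed at the Naive Bayes values or estimated. Replacing `gaussian_log_density_ratio` with a more flexible univariate ratio estimator would bring the sketch closer to the framework described above.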

Paper Structure

This paper contains 12 sections, 17 equations, 8 figures.

Figures (8)

  • Figure 1: Average misclassification rate over 100 replications at different training sizes for multivariate $t$ distributions with 5, 10, and 30 degrees of freedom and for the Gaussian distribution. Error rates were computed every 30 observations. These simulations used $p = 32$ covariates, where the mean vectors $\boldsymbol{\mu}$ and the covariance matrices $\boldsymbol{\Sigma}$ used in the data generation were the sample means and covariances of the two classes in the ionosphere dataset.
  • Figure 2: Average misclassification rate over 200 replications at different training sizes for the abalone dataset. Observations corresponding to the middle two quartiles of the response variable were removed from the dataset, so classification took place on data that were above the 3rd quartile (Class 1) or below the 1st quartile (Class 0).
  • Figure 3: Average misclassification rate over 200 replications at different training sizes for the adult dataset. Non-continuous variables were removed from the classification.
  • Figure 4: Average misclassification rate over 200 replications at different training sizes for the Boston housing dataset. The response variable was whether the median value of the home exceeded the median value of all homes. Non-continuous variables were removed from the classification. This dataset is named Boston and is available in the R package MASS.
  • Figure 5: Average misclassification rate over 200 replications at different training sizes for the ionosphere dataset. Non-continuous variables were removed from the classification.
  • ...and 3 more figures
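
The captions for Figures 2 and 4 describe two ways of turning a continuous response into binary class labels. The sketch below illustrates both under stated assumptions: the function names are hypothetical, the inequalities are strict as the captions suggest ("above the 3rd quartile", "below the 1st quartile", "exceeded the median"), and the quartiles use NumPy's default interpolation, which may differ from what the authors used.

```python
import numpy as np

def extreme_quartile_labels(response):
    """Drop the middle two quartiles; label the top quarter 1, the bottom 0.

    Returns a boolean mask of retained observations and their labels.
    """
    q1, q3 = np.quantile(response, [0.25, 0.75])
    keep = (response < q1) | (response > q3)
    labels = (response[keep] > q3).astype(int)
    return keep, labels

def median_split_labels(response):
    """Label an observation 1 when its response exceeds the overall median."""
    return (response > np.median(response)).astype(int)

# Toy response values standing in for a real dataset column.
response = np.arange(1, 101, dtype=float)

keep, extreme_labels = extreme_quartile_labels(response)
median_labels = median_split_labels(response)
print("retained:", keep.sum(), "of", response.size)
print("class 1 among retained:", extreme_labels.sum())
print("class 1 under median split:", median_labels.sum())
```

The mask returned by `extreme_quartile_labels` would be applied to the covariate matrix as well, so that features and labels stay aligned.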