Discriminative classification with generative features: bridging Naive Bayes and logistic regression
Zachary Terner, Alexander Petersen, Yuedong Wang
TL;DR
The paper addresses how to improve discriminative classification by leveraging generative information. It introduces Smart Bayes, a framework that uses marginal log-density-ratio features as inputs to a logistic-regression model, combining the density-ratio cues of Naive Bayes with discriminative learning of their weights. A spline-based estimator of univariate log-density ratios makes feature construction flexible and practical, and the resulting approach often outperforms both Naive Bayes and standard logistic regression in simulations and on real-world datasets. The work highlights the value of hybrid generative-discriminative strategies and suggests extending density-ratio features to other learners and applications.
Abstract
We introduce Smart Bayes, a new classification framework that bridges generative and discriminative modeling by integrating likelihood-ratio-based generative features into a logistic-regression-style discriminative classifier. From a generative perspective, Smart Bayes relaxes the fixed unit weights of Naive Bayes by allowing data-driven coefficients on density-ratio features. From a discriminative perspective, it constructs transformed inputs as marginal log-density ratios that explicitly quantify how much more likely each feature value is under one class than under the other, thereby providing predictors with stronger class separation than the raw covariates. To support this framework, we develop a spline-based estimator for univariate log-density ratios that is flexible, robust, and computationally efficient. In extensive simulations and real-data studies, Smart Bayes often outperforms both logistic regression and Naive Bayes. Our results highlight the potential of hybrid approaches that exploit generative structure to enhance discriminative performance.
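The two-stage idea described in the abstract can be illustrated with a minimal sketch: first transform each covariate into an estimated marginal log-density ratio, then fit a logistic regression on those transformed features so that the coefficients are learned rather than fixed at one as in Naive Bayes. The sketch below is not the authors' implementation. It assumes a simple Gaussian kernel density estimate in place of the paper's spline-based estimator, fits the logistic model by plain gradient descent, and uses synthetic data; all function names, the bandwidth, and the learning-rate settings are hypothetical choices made for illustration.

```python
import math
import random


def kde_log_density(sample, bandwidth):
    """Return x -> log of a Gaussian kernel density estimate built from `sample`.
    (Assumption: a KDE stands in for the paper's spline-based estimator.)"""
    norm = len(sample) * bandwidth * math.sqrt(2 * math.pi)

    def log_density(x):
        total = sum(math.exp(-0.5 * ((x - t) / bandwidth) ** 2) for t in sample)
        return math.log(max(total / norm, 1e-300))  # floor avoids log(0)

    return log_density


def fit_log_ratios(X, y, bandwidth=0.5):
    """For each feature j, estimate log f_j(x | class 1) - log f_j(x | class 0)."""
    ratios = []
    for j in range(len(X[0])):
        log_f1 = kde_log_density([r[j] for r, c in zip(X, y) if c == 1], bandwidth)
        log_f0 = kde_log_density([r[j] for r, c in zip(X, y) if c == 0], bandwidth)
        ratios.append(lambda x, a=log_f1, b=log_f0: a(x) - b(x))
    return ratios


def transform(X, ratios):
    """Replace every raw covariate by its estimated marginal log-density ratio."""
    return [[ratio(row[j]) for j, ratio in enumerate(ratios)] for row in X]


def sigmoid(z):
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_logistic(Z, y, lr=0.05, epochs=400):
    """Batch gradient descent on the logistic loss; returns [intercept, w_1, ..., w_p]."""
    n, p = len(Z), len(Z[0])
    beta = [0.0] * (p + 1)
    for _ in range(epochs):
        grad = [0.0] * (p + 1)
        for z, label in zip(Z, y):
            err = sigmoid(beta[0] + sum(b * v for b, v in zip(beta[1:], z))) - label
            grad[0] += err
            for j in range(p):
                grad[j + 1] += err * z[j]
        beta = [b - lr * g / n for b, g in zip(beta, grad)]
    return beta


def accuracy(Z, y, beta):
    hits = 0
    for z, label in zip(Z, y):
        score = beta[0] + sum(b * v for b, v in zip(beta[1:], z))
        hits += int(score > 0) == label
    return hits / len(y)


# Synthetic data (assumption): feature 0 has the same mean in both classes but a
# different variance, so it is not linearly separable; feature 1 is pure noise.
rng = random.Random(0)
X = [[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(200)]
X += [[rng.gauss(0, 3), rng.gauss(0, 1)] for _ in range(200)]
y = [0] * 200 + [1] * 200

ratios = fit_log_ratios(X, y)
Z = transform(X, ratios)
beta_smart = fit_logistic(Z, y)   # learned weights on log-density-ratio features
beta_raw = fit_logistic(X, y)     # ordinary logistic regression on raw covariates

acc_smart = accuracy(Z, y, beta_smart)
acc_raw = accuracy(X, y, beta_raw)
print("log-ratio features:", acc_smart, "raw covariates:", acc_raw)
```

Fixing all feature weights at one and the intercept at the log prior odds would recover a Naive Bayes classifier built on the same density estimates, so the fitted coefficients in `beta_smart` show how far the data-driven weights depart from that baseline. The accuracies printed here are in-sample and only meant to show the mechanics; a real comparison would use held-out data.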
