Large Language Model Enhanced Machine Learning Estimators for Classification
Yuhang Wu, Yingfei Wang, Chu Wang, Zeyu Zheng
TL;DR
The paper studies how to improve binary classification by integrating pre-trained large language models (LLMs) with classical ML estimators. It proposes three complementary strategies: (i) linear ensembles that combine ML predictions and LLM signals using fixed or adaptive weights, (ii) calibration and multi-accuracy approaches that incorporate LLM outputs to produce well-calibrated probabilities, and (iii) transfer learning under covariate shift via LLM-driven data augmentation. Key contributions include the adaptive AdaLinear weighting, two calibration schemes (one of them a low-parameter, discretized approach), and a transfer-learning method that adds LLM-labeled target samples through a weakly supervised loss. On four public NLP datasets (WANDS, Yelp, Emotion, Hate), the integrated methods consistently outperform either the LLM or the ML estimator alone, improving prediction accuracy and robustness to distribution changes, which has practical implications for deploying LLM-enhanced classifiers in real-world tasks.
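To make the fixed-weight linear ensemble in strategy (i) concrete, here is a minimal sketch under stated assumptions. It is not the paper's implementation and it does not reproduce AdaLinear. The function names (`log_loss`, `ensemble`, `fit_weight`), the grid search over a single weight, and the synthetic probabilities are all hypothetical choices made for illustration. The sketch assumes both the ML estimator and the LLM give a probability for the positive class.

```python
import numpy as np


def log_loss(y, p, eps=1e-12):
    """Mean binary cross-entropy between labels y and probabilities p."""
    p = np.clip(p, eps, 1 - eps)
    return float(-np.mean(y * np.log(p) + (1 - y) * np.log(1 - p)))


def ensemble(p_ml, p_llm, w):
    """Convex combination of the ML and LLM probability estimates."""
    return w * np.asarray(p_ml) + (1 - w) * np.asarray(p_llm)


def fit_weight(y_val, p_ml_val, p_llm_val, grid=None):
    """Pick the fixed weight in [0, 1] that minimizes validation log loss.

    Assumption: a simple grid search stands in for whatever fitting
    procedure the paper actually uses.
    """
    if grid is None:
        grid = np.linspace(0.0, 1.0, 101)
    losses = [log_loss(y_val, ensemble(p_ml_val, p_llm_val, w)) for w in grid]
    return float(grid[int(np.argmin(losses))])


# Synthetic stand-ins for validation labels and the two probability sources.
rng = np.random.default_rng(0)
y = rng.integers(0, 2, size=500)
p_ml = np.clip(0.5 + (y - 0.5) * 0.4 + rng.normal(0, 0.25, 500), 0.01, 0.99)
p_llm = np.clip(0.5 + (y - 0.5) * 0.4 + rng.normal(0, 0.25, 500), 0.01, 0.99)

w = fit_weight(y, p_ml, p_llm)
loss_ens = log_loss(y, ensemble(p_ml, p_llm, w))
loss_ml = log_loss(y, p_ml)
loss_llm = log_loss(y, p_llm)
```

Because the grid contains both 0 and 1, the selected weight can never do worse on the validation set than either source alone. An adaptive scheme would instead let the weight depend on the input, which this sketch does not attempt.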
Abstract
Pre-trained large language models (LLMs) have emerged as a powerful tool for simulating various scenarios and generating output given specific instructions and multimodal input. In this work, we analyze the specific use of LLMs to enhance a classical supervised machine learning method for classification problems. We propose several approaches for integrating an LLM into a classical machine learning estimator to further enhance prediction performance. We examine the performance of the proposed approaches on both standard supervised binary classification tasks and a transfer learning task in which the test data exhibit a distribution shift relative to the training data. Numerical experiments on four publicly available datasets suggest that using LLMs to enhance classical machine learning estimators can provide significant improvement in prediction performance.
