LLM-based feature generation from text for interpretable machine learning

Vojtěch Balek; Lukáš Sýkora; Vilém Sklenák; Tomáš Kliegr

TL;DR

The paper tackles the challenge of obtaining interpretable text representations for rule-learning by using large language models (LLMs) to generate a small, semantically meaningful set of features. It introduces two workflows—manual feature specification and automatic feature discovery—evaluates them on five diverse datasets, and demonstrates predictive performance competitive with SciBERT while preserving interpretability. SHAP-based explanations and action-rule mining show that these features align with domain knowledge and yield actionable recommendations for article improvement. The results generalize across domains and highlight the practical potential of LLM-derived, interpretable features for transparent decision-making in text analytics.

Abstract

Existing text representations such as embeddings and bag-of-words are not suitable for rule learning due to their high dimensionality and absent or questionable feature-level interpretability. This article explores whether large language models (LLMs) could address this by extracting a small number of interpretable features from text. We demonstrate this process on two datasets (CORD-19 and M17+) containing several thousand scientific articles from multiple disciplines, with a target that is a proxy for research impact. An evaluation that tested for statistically significant correlation with research impact showed that Llama 2-generated features are semantically meaningful. We then used these generated features in text classification to predict the binary target variable representing the citation rate for the CORD-19 dataset and the ordinal 5-class target representing an expert-awarded grade in the M17+ dataset. Machine-learning models trained on the LLM-generated features provided predictive performance similar to that of SciBERT, the state-of-the-art embedding model for scientific text. The LLM used only 62 features compared to 768 features in SciBERT embeddings, and these features were directly interpretable, corresponding to notions such as article methodological rigor, novelty, or grammatical correctness. As the final step, we extract a small number of well-interpretable action rules. Consistently competitive results obtained with the same LLM feature set across both thematically diverse datasets show that this approach generalizes across domains.
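To make the notion of a small, directly interpretable feature set concrete, the following is a minimal sketch and not the authors' implementation. The feature names rigor, novelty, and grammar are taken from the abstract and the figure captions; the JSON reply format, the value encodings, and the helper `parse_features` are assumptions made for illustration only.

```python
import json

# Hypothetical three-feature subset; the paper reports 62 features.
# Assumed encodings: ordinal features in {0, 1, 2}, binary features in {0, 1}.
FEATURE_SPEC = {
    "rigor": {0, 1, 2},
    "novelty": {0, 1, 2},
    "grammar": {0, 1},
}


def parse_features(llm_response: str) -> dict:
    """Turn an (assumed) JSON reply from an LLM into a validated feature row."""
    raw = json.loads(llm_response)
    row = {}
    for name, allowed in FEATURE_SPEC.items():
        value = raw.get(name)
        # Accept only plain integers within the allowed range; anything else
        # becomes None so it can be imputed or dropped downstream.
        is_plain_int = isinstance(value, int) and not isinstance(value, bool)
        row[name] = value if is_plain_int and value in allowed else None
    return row


# Stand-in for a real model reply; keys outside FEATURE_SPEC are ignored.
example_reply = '{"rigor": 2, "novelty": 1, "grammar": 1, "extra": "ignored"}'
row = parse_features(example_reply)
print(row)
```

Each key in the resulting row is a named, human-readable column, which is what allows a rule learner to produce conditions over notions such as rigor or novelty rather than over anonymous embedding dimensions.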

Paper Structure (33 sections, 13 equations, 5 figures, 15 tables)

Introduction
Methods
Input data
Scientometric datasets
Other datasets
Machine-learning algorithms and their hyperparameters
LLM-based feature generation based on user-set feature list
LLM-based feature generation with automatic feature discovery
Explainability
Feature subsets
Action rules
Results
Computational resources and complexity
LLMs on local GPUs
LLMs on cloud GPUs
...and 18 more sections

Figures (5)

Figure 1: SHAP plot for a classification model trained on LLM-feature only feature set for the M17+ dataset: automated LLM-based feature discovery combined with LLM-based feature generation (left) vs user-selected features combined with LLM-based feature generation (right). The X-axis represents the SHAP value, indicating the feature's effect on the prediction—positive values increase the prediction, while negative values decrease it. Features are ranked by importance on the Y-axis. The color of the points corresponds to the feature values. For binary features like grammar or basic_medicine, blue indicates a value of 0, while red represents a value of 1. For categorical features such as rigor or novelty, blue denotes a low value, purple indicates a medium value, and red signifies a high value.
Figure 2: SHAP plot for classification models trained on LLM-feature only feature set for the Hate Speech dataset, target class 1 (hate speech present) (left) and the Food Hazard dataset target class 1 (biological food hazard) (right). For guidance on the general interpretation of the SHAP plot, refer to the caption of the previous figure.
Figure 3: Survey of user-perceived relevance of automatically discovered features. Each survey participant rated the relevance of 20 features that were automatically discovered for one of five datasets (CORD-19, M17+, BANKING77, Hate Speech and Food Hazard).
Figure 4: Full prompt for LLM-based feature discovery. The prompt can be parameterised by inserting specific values in place of the placeholders, which are denoted with the $ symbol.
Figure 5: Example of an automatically generated prompt used for LLM-based feature generation for a single instance from the Food Hazard dataset. The prompt can be parameterised by inserting specific text in place of the placeholder $text.
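
The $-placeholder convention described in the captions of Figures 4 and 5 can be sketched with Python's standard `string.Template`. This is an illustration under stated assumptions: the prompt wording, the `$features` placeholder, the feature names, and the sample text are all hypothetical; only the `$text` placeholder is named in the captions, and the actual prompts are those shown in the figures.

```python
from string import Template

# Hypothetical prompt wording; the real prompts appear in Figures 4 and 5.
GENERATION_PROMPT = Template(
    "Rate the following text on each feature and answer in JSON.\n"
    "Features: $features\n"
    "Text: $text"
)


def build_prompt(text: str, features: list[str]) -> str:
    """Fill the $-placeholders for one instance and one feature list."""
    # substitute() raises KeyError if any placeholder is left unfilled,
    # which guards against sending a half-parameterised prompt.
    return GENERATION_PROMPT.substitute(text=text, features=", ".join(features))


# Made-up instance and feature names, for illustration only.
prompt = build_prompt(
    "Salmonella detected in frozen chicken.",
    ["biological_hazard", "recall"],
)
print(prompt)
```

Using `substitute` rather than `safe_substitute` is a deliberate choice here, since a silently unfilled placeholder would otherwise reach the model as literal text.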
