"Oh LLM, I'm Asking Thee, Please Give Me a Decision Tree": Zero-Shot Decision Tree Induction and Embedding with Large Language Models

Ricardo Knauer; Mario Koddenbrock; Raphael Wallsberger; Nicholas M. Brisson; Georg N. Duda; Deborah Falla; David W. Evans; Erik Rodner

TL;DR

This work shows that LLMs can be harnessed to generate intrinsically interpretable decision trees without training data, addressing low-data and privacy-constrained settings by leveraging the models' world knowledge. It proposes two zero-shot pipelines: (i) decision tree induction via carefully crafted prompts and (ii) tree-based embeddings produced through knowledge distillation and a binary-embedding transformation. Across 13 public and 2 private tabular datasets, zero-shot trees sometimes outperform data-driven counterparts, and embeddings yield statistically significant improvements over data-driven embeddings, establishing knowledge-driven baselines for low-data tasks. The approach offers a practical, privacy-preserving avenue to inject prior knowledge into tabular learning and motivates further exploration with advancing LLMs and probabilistic extensions.

Abstract

Large language models (LLMs) provide powerful means to leverage prior knowledge for predictive modeling when data is limited. In this work, we demonstrate how LLMs can use their compressed world knowledge to generate intrinsically interpretable machine learning models, i.e., decision trees, without any training data. We find that these zero-shot decision trees can even surpass data-driven trees on some small-sized tabular datasets and that embeddings derived from these trees perform better than data-driven tree-based embeddings on average. Our decision tree induction and embedding approaches can therefore serve as new knowledge-driven baselines for data-driven machine learning methods in the low-data regime. Furthermore, they offer ways to harness the rich world knowledge within LLMs for tabular machine learning tasks. Our code and results are available at https://github.com/ml-lab-htw/llm-trees.

Paper Structure (29 sections, 1 equation, 10 figures, 9 tables)

Introduction
Related Work
LLMs as Zero-Shot Model Generators
Zero-Shot Decision Tree Induction
Zero-Shot Decision Tree Embedding
Knowledge Distillation
Embedding Transformation
Output Formatting
Experimental Setup
Datasets
Decision Tree Induction Setup
Methods
Evaluation Metrics
Decision Tree Embedding Setup
Methods
...and 14 more sections

Figures (10)

Figure 1: Embedding transformation $\chi_2$ using the iris data decision tree from Listing \ref{lst:prompt_template} and a sample with a petal width of 1.70. The 2 inner nodes of the structured tree $T$ are mapped to a binary embedding vector of dimension 2. Thus, information from the leaf nodes is indirectly embedded.
Figure 2: Test F1-score at 67%/33% train/test splits for our LLM-based zero-shot decision tree induction approach compared to the machine learning baselines on our private (a) ACL injury data and (b) post-trauma pain data.
Figure 3: Test F1-score at 67%/33% train/test splits of a multi-layer perceptron without embeddings, with our LLM-based zero-shot decision tree embeddings, as well as with unsupervised, self-supervised, and supervised embedding baselines on our private (a) ACL injury data and (b) post-trauma pain data.
Figure 4: Critical difference diagrams to detect pairwise test score differences between our methods at 67%/33% train/test splits for the public and private datasets, based on the Holm-adjusted Wilcoxon signed-rank test (Benavoli et al., 2016; Demšar, 2006). Approaches that are not statistically different at the 0.05 significance level are connected by a bold horizontal bar. Missing performance scores were imputed with 0 before running the tests.
Figure 5: Test balanced accuracy at 67%/33% train/test splits for our LLM-based zero-shot decision tree induction approach compared to the machine learning baselines on our private (a) ACL injury data and (b) post-trauma pain data.
...and 5 more figures
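
The embedding transformation described in the Figure 1 caption can be illustrated with a minimal sketch. It assumes that each inner node contributes one bit, namely the truth value of its split condition for the given sample, and that every inner node is evaluated regardless of the path taken. The tree itself is not reproduced on this page, so the `InnerNode` class, the `embed` function, the feature name `petal_width`, and both thresholds (0.80 and 1.75) are hypothetical placeholders, not the values from the paper's listing.

```python
from dataclasses import dataclass
from typing import Dict, List

Sample = Dict[str, float]


@dataclass
class InnerNode:
    """One split of a decision tree: feature <= threshold."""
    feature: str
    threshold: float

    def condition(self, sample: Sample) -> bool:
        # True if the sample satisfies this node's split condition.
        return sample[self.feature] <= self.threshold


# Hypothetical two-node iris tree; the thresholds are illustrative
# assumptions, not taken from the paper.
INNER_NODES = [
    InnerNode("petal_width", 0.80),
    InnerNode("petal_width", 1.75),
]


def embed(sample: Sample, inner_nodes: List[InnerNode] = INNER_NODES) -> List[int]:
    """Map a sample to one bit per inner node (1 = condition holds)."""
    return [int(node.condition(sample)) for node in inner_nodes]


if __name__ == "__main__":
    # Sample with a petal width of 1.70, as in the Figure 1 caption.
    print(embed({"petal_width": 1.70}))  # [0, 1] under the assumed thresholds
```

Under these assumptions the embedding has one dimension per inner node, so a tree with 2 inner nodes yields a vector of dimension 2. The leaf predictions never appear in the vector directly; they are only implied by the pattern of satisfied conditions.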
