"Oh LLM, I'm Asking Thee, Please Give Me a Decision Tree": Zero-Shot Decision Tree Induction and Embedding with Large Language Models
Ricardo Knauer, Mario Koddenbrock, Raphael Wallsberger, Nicholas M. Brisson, Georg N. Duda, Deborah Falla, David W. Evans, Erik Rodner
TL;DR
This work shows that LLMs can generate intrinsically interpretable decision trees without any training data, addressing low-data and privacy-constrained settings by drawing on the models' world knowledge. It proposes two zero-shot pipelines: (i) decision tree induction via carefully crafted prompts and (ii) tree-based embeddings produced through knowledge distillation and a binary-embedding transformation. Across 13 public and 2 private tabular datasets, zero-shot trees sometimes outperform their data-driven counterparts, and the zero-shot embeddings yield statistically significant improvements over data-driven embeddings, establishing knowledge-driven baselines for low-data tasks. The approach offers a practical, privacy-preserving way to inject prior knowledge into tabular learning and motivates further work with more capable LLMs and probabilistic extensions.
Abstract
Large language models (LLMs) provide powerful means to leverage prior knowledge for predictive modeling when data is limited. In this work, we demonstrate how LLMs can use their compressed world knowledge to generate intrinsically interpretable machine learning models, i.e., decision trees, without any training data. We find that these zero-shot decision trees can even surpass data-driven trees on some small tabular datasets and that embeddings derived from these trees perform better than data-driven tree-based embeddings on average. Our decision tree induction and embedding approaches can therefore serve as new knowledge-driven baselines for data-driven machine learning methods in the low-data regime. Furthermore, they offer ways to harness the rich world knowledge within LLMs for tabular machine learning tasks. Our code and results are available at https://github.com/ml-lab-htw/llm-trees.
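
To make the two ideas concrete, the following is a minimal sketch rather than the authors' implementation. It assumes that a zero-shot tree can be represented as a plain Python function and that a binary embedding can be formed from the truth values of the tree's inner nodes. The feature names (`age`, `blood_pressure`), the thresholds, and the helper names `predict` and `embed` are hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Tuple


def predict(x: Dict[str, float]) -> Tuple[int, List[int]]:
    """Hypothetical depth-2 tree of the kind an LLM might propose zero-shot.

    Returns the class prediction and the truth value (0 or 1) of every
    inner node. Features and thresholds are invented for illustration.
    """
    # Evaluate every inner node, whether or not it lies on the path taken.
    nodes = [
        int(x["age"] > 50),               # inner node 0 (root)
        int(x["blood_pressure"] > 140),   # inner node 1
    ]
    if nodes[0]:
        prediction = 1 if nodes[1] else 0
    else:
        prediction = 0
    return prediction, nodes


def embed(rows: List[Dict[str, float]]) -> List[List[int]]:
    """Map each row to a binary vector of inner-node truth values."""
    return [predict(row)[1] for row in rows]


rows = [
    {"age": 60, "blood_pressure": 150},
    {"age": 40, "blood_pressure": 150},
]
print([predict(r)[0] for r in rows])
print(embed(rows))
```

Under these assumptions, the binary vectors could be passed as features to any downstream classifier, while the function body itself remains human-readable.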
