JoLT: Joint Probabilistic Predictions on Tabular Data Using LLMs
Aliaksandra Shysheya, John Bronskill, James Requeima, Shoaib Ahmed Siddiqui, Javier Gonzalez, David Duvenaud, Richard E. Turner
TL;DR
JoLT (Joint LLM Process for Tabular data) is a simple, prompt-based framework that uses in-context learning with LLMs to generate joint probabilistic predictions over heterogeneous tabular outputs without model training or data preprocessing. It handles missing data without special treatment, uses textual side information to refine predictions, and supports both sample-based inference and full predictive distributions computed from LLM logits, which enables uncertainty quantification. On low-shot classification and multi-target tasks, JoLT often surpasses strong baselines, particularly when side information is available, and it is competitive at data imputation. The work highlights practical advantages for real-world prediction problems, acknowledges context-size and computational constraints, and outlines directions for scaling to larger models and richer side information.
Abstract
We introduce JoLT (Joint LLM Process for Tabular data), a simple method for probabilistic predictions on tabular data based on Large Language Models (LLMs). JoLT uses the in-context learning capabilities of LLMs to define joint distributions over tabular data conditioned on user-specified side information about the problem, exploiting the vast repository of latent problem-relevant knowledge encoded in LLMs. JoLT defines joint distributions for multiple target variables with potentially heterogeneous data types without any data conversion, data preprocessing, special handling of missing data, or model training, making it accessible and efficient for practitioners. Our experiments show that JoLT outperforms competitive methods on low-shot single-target and multi-target tabular classification and regression tasks. Furthermore, we show that JoLT can automatically handle missing data and perform data imputation by leveraging textual side information. We argue that due to its simplicity and generality, JoLT is an effective approach for a wide variety of real prediction problems.
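To make the prompt-based idea concrete, the following is a minimal sketch of how labelled rows, textual side information, and a query row might be serialized into one in-context prompt. It is an illustration under stated assumptions, not the authors' implementation: the `name: value` layout, the `=>` separator, the helper names `serialize_row` and `build_prompt`, and the example columns are all hypothetical. Missing cells are simply left out of the text, and several target columns of mixed type sit side by side in the same line.

```python
from typing import Mapping, Optional, Sequence

Row = Mapping[str, Optional[object]]


def serialize_row(row: Row, columns: Sequence[str]) -> str:
    """Render 'name: value' pairs, skipping cells that are missing (None or absent)."""
    parts = [f"{col}: {row[col]}" for col in columns if row.get(col) is not None]
    return "; ".join(parts)


def build_prompt(
    side_info: str,
    train_rows: Sequence[Row],
    query_row: Row,
    feature_cols: Sequence[str],
    target_cols: Sequence[str],
) -> str:
    """Assemble a few-shot prompt (hypothetical format, assumed for illustration)."""
    lines = []
    if side_info:
        # Free-text description of the problem goes first.
        lines.append(side_info)
    for row in train_rows:
        features = serialize_row(row, feature_cols)
        targets = serialize_row(row, target_cols)
        lines.append(f"{features} => {targets}")
    # The query line stops after the first target's name, so a language model
    # would continue with that target's value and then the remaining targets.
    lines.append(f"{serialize_row(query_row, feature_cols)} => {target_cols[0]}:")
    return "\n".join(lines)


train = [
    {"age": 34, "job": "nurse", "income": "low", "owns_home": "no"},
    {"age": 51, "job": None, "income": "high", "owns_home": "yes"},  # 'job' is missing
]
query = {"age": 45, "job": "teacher"}

prompt = build_prompt(
    "Each row describes a survey respondent.",
    train,
    query,
    feature_cols=["age", "job"],
    target_cols=["income", "owns_home"],
)
print(prompt)
```

In a setup like this, the model's next-token probabilities over the continuation would supply the predictive distribution for the first target, and generating the targets one after another would give a joint prediction. The sketch stops at prompt construction and makes no model call.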
