Towards Universal Neural Likelihood Inference
Shreyas Bhat Brahmavar, Yang Li, Qiyang Liu, Shashank Srivastava, Junier Oliva
TL;DR
The paper tackles universal likelihood inference across heterogeneous tabular data by introducing ASPIRE, a single model that outputs data-grounded conditional likelihoods for arbitrary targets. ASPIRE uses a universal likelihood framework with permutation-invariant, set-based reasoning, combining feature-value atoms, intra- and inter-instance processing, and semantic grounding through dataset descriptions. It achieves state-of-the-art zero-, few-, and many-shot performance across 1,400 real-world datasets, while enabling open-world active feature acquisition that selects informative features at inference time. The work demonstrates the practical impact of combining semantic grounding, permutation-aware inference, and probabilistic conditioning for cross-domain open-world inference and adaptive data acquisition.
Abstract
We introduce universal neural likelihood inference (UNLI): enabling a single model to provide data-grounded, conditional likelihood predictions for arbitrary targets given any collection of observed features, across diverse domains and tasks. To achieve UNLI over heterogeneous tabular data, we develop the Arbitrary Set-based Permutation-Invariant Reasoning Engine (ASPIRE) model. Our design addresses critical gaps in existing approaches to merge semantic-understanding capabilities and generalised numerical feature reasoning within a zero-shot capable framework. Trained on over 1,400 diverse real-world datasets spanning various domains, ASPIRE achieves 15% higher F1 scores and 85% lower RMSE than existing tabular foundation models in zero-shot and few-shot settings. Lastly, this work introduces open-world active feature acquisition, in which we leverage the UNLI capabilities of ASPIRE to determine which feature values to observe next in order to improve prediction accuracy at inference time.
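To make the idea of active feature acquisition concrete, the following is a minimal sketch of one common greedy criterion: pick the unobserved feature whose observation is expected to reduce the entropy of the target distribution the most. This is not ASPIRE's procedure, which the abstract does not specify. The conditional likelihoods here come from a hypothetical toy count table (`ROWS`) standing in for a learned model, and all names (`target_dist`, `expected_entropy_after`, `next_feature`) are illustrative assumptions.

```python
import math
from collections import Counter

# Hypothetical toy dataset: (observed feature values, target label).
# By construction, "fever" determines the label and "cough" is uninformative.
ROWS = [
    ({"fever": 1, "cough": 1}, "flu"),
    ({"fever": 1, "cough": 0}, "flu"),
    ({"fever": 0, "cough": 1}, "healthy"),
    ({"fever": 0, "cough": 0}, "healthy"),
]


def matching(observed):
    """Rows consistent with the currently observed feature values."""
    return [(x, y) for x, y in ROWS
            if all(x[k] == v for k, v in observed.items())]


def target_dist(observed):
    """Empirical p(target | observed); a stand-in for a model's likelihoods."""
    counts = Counter(y for _, y in matching(observed))
    total = sum(counts.values())
    return {y: c / total for y, c in counts.items()}


def entropy(dist):
    """Shannon entropy in bits."""
    return -sum(p * math.log2(p) for p in dist.values() if p > 0)


def expected_entropy_after(observed, feature):
    """Expected target entropy if `feature` were observed next."""
    rows = matching(observed)
    value_counts = Counter(x[feature] for x, _ in rows)
    return sum(
        (c / len(rows)) * entropy(target_dist({**observed, feature: v}))
        for v, c in value_counts.items()
    )


def next_feature(observed, candidates):
    """Greedy choice: the candidate with the lowest expected entropy."""
    return min(candidates, key=lambda f: expected_entropy_after(observed, f))


if __name__ == "__main__":
    print(next_feature({}, ["cough", "fever"]))
```

In this toy setting the greedy rule selects `fever`, since observing it removes all uncertainty about the label while observing `cough` removes none. A universal likelihood model would replace the count table with its own conditional predictions over both the target and the candidate feature values.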
