Count-Based Approaches Remain Strong: A Benchmark Against Transformer and LLM Pipelines on Structured EHR
Jifan Gao, Michael Rosenthal, Brian Wolpin, Simona Cristea
TL;DR
Count-based approaches using ontology roll-ups remain competitive for structured EHR prediction, even against transformer and mixture-of-agents (MoA) pipelines. The study conducts a head-to-head benchmark of three model families on eight tasks from the EHRSHOT dataset, including two label schemes. Across tasks, the count-based models and the MoA pipeline largely split the head-to-head wins, while CLMBR typically lags, highlighting the enduring value of simple, interpretable, and data-efficient approaches. The findings underscore that traditional tabular methods are strong baselines for structured EHR, while MoA and related LLM-based methods can provide task-specific gains and interpretability advantages.
Abstract
Structured electronic health records (EHR) are essential for clinical prediction. While count-based learners continue to perform strongly on such data, no benchmark has directly compared them against more recent mixture-of-agents LLM pipelines, which have been reported to outperform single LLMs in various NLP tasks. In this study, we evaluated three categories of methods for EHR prediction on eight outcomes from the EHRSHOT dataset: count-based models built from ontology roll-ups with two time bins, based on LightGBM and the tabular foundation model TabPFN; a pretrained sequential transformer (CLMBR); and a mixture-of-agents pipeline that converts tabular histories to natural-language summaries and then applies a text classifier. Across the eight tasks, head-to-head wins were largely split between the count-based and the mixture-of-agents methods. Given their simplicity and interpretability, count-based models remain a strong candidate for structured EHR benchmarking. The source code is available at: https://github.com/cristea-lab/Structured_EHR_Benchmark.
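
To make the idea of count features from ontology roll-ups with two time bins concrete, the following is a minimal sketch and not the authors' implementation. The parent map `PARENT`, the helper names `ancestors` and `count_features`, the example codes, and the 365-day boundary between bins are all hypothetical choices made for illustration; the abstract does not specify them.

```python
from collections import Counter

# Hypothetical child -> parent ontology map (illustrative codes only).
PARENT = {
    "E11.9": "E11",
    "E11.65": "E11",
    "E11": "E08-E13",
    "I10": "I10-I16",
}


def ancestors(code, parent=PARENT):
    """Return the code followed by all of its ancestors in the ontology."""
    chain = [code]
    while chain[-1] in parent:
        chain.append(parent[chain[-1]])
    return chain


def count_features(events, split_days=365):
    """Count each code and its ancestors separately in two time bins.

    events: iterable of (code, days_before_prediction) pairs.
    split_days: assumed boundary between the "recent" and "older" bins.
    """
    feats = Counter()
    for code, days_before in events:
        bin_name = "recent" if days_before <= split_days else "older"
        # Roll-up: an event also counts toward every ancestor concept.
        for concept in ancestors(code):
            feats[(concept, bin_name)] += 1
    return feats


# One hypothetical patient history.
patient = [("E11.9", 30), ("E11.65", 400), ("I10", 10), ("E11.9", 900)]
features = count_features(patient)
```

In a sketch like this, each `(concept, bin)` count would become one column of a patient-by-feature table, which is the kind of input a tabular learner such as LightGBM or TabPFN consumes.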
