Post Hoc Regression Refinement via Pairwise Rankings
Kevin Tirta Wijaya, Michael Sun, Minghao Guo, Hans-Peter Seidel, Wojciech Matusik, Vahid Babaei
TL;DR
RankRefine addresses the challenge of accurate regression in data-scarce regimes by incorporating small sets of pairwise rankings as auxiliary signals. It fuses the base regressor output with a rank-based estimate derived from a Bradley–Terry likelihood via inverse-variance weighting, yielding a minimum-variance unbiased predictor under Gaussian assumptions. Theoretical analysis guarantees MAE reduction whenever the ranker variance is finite, and empirical results across nine molecular-property benchmarks and several tabular tasks demonstrate consistent improvements, even with ranker accuracies as low as ~0.55 and with modest reference budgets ($k \,\approx\,20$). The approach also works with off-the-shelf LLMs (e.g., ChatGPT-4o) and human raters, highlighting practical applicability in low-data settings and interactive decision-making contexts.
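For readers who want the fusion step spelled out, the standard inverse-variance combination of two independent, unbiased estimates is shown below. The symbols $\hat{y}_{\text{reg}}, \sigma_{\text{reg}}^{2}$ (base regressor) and $\hat{y}_{\text{rank}}, \sigma_{\text{rank}}^{2}$ (rank-based estimate) are notation assumed here for illustration and may differ from the paper's.

$$
\hat{y} = \frac{\sigma_{\text{rank}}^{2}\,\hat{y}_{\text{reg}} + \sigma_{\text{reg}}^{2}\,\hat{y}_{\text{rank}}}{\sigma_{\text{reg}}^{2} + \sigma_{\text{rank}}^{2}},
\qquad
\sigma^{2} = \frac{\sigma_{\text{reg}}^{2}\,\sigma_{\text{rank}}^{2}}{\sigma_{\text{reg}}^{2} + \sigma_{\text{rank}}^{2}}
$$

Since $\sigma^{2} = \sigma_{\text{reg}}^{2}\cdot\frac{\sigma_{\text{rank}}^{2}}{\sigma_{\text{reg}}^{2}+\sigma_{\text{rank}}^{2}} < \sigma_{\text{reg}}^{2}$ for any finite $\sigma_{\text{rank}}^{2}$ (and $\sigma_{\text{reg}}^{2} > 0$), the fused variance is strictly smaller than that of the base regressor. For zero-mean Gaussian errors the expected absolute error is $\sigma\sqrt{2/\pi}$, so a smaller variance also means a smaller expected MAE, which is consistent with the guarantee stated above.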
Abstract
Accurate prediction of continuous properties is essential to many scientific and engineering tasks. Although deep-learning regressors excel with abundant labels, their accuracy deteriorates in data-scarce regimes. We introduce RankRefine, a model-agnostic, plug-and-play post hoc method that refines regression using expert knowledge expressed as pairwise rankings. Given a query item and a small reference set with known properties, RankRefine combines the base regressor's output with a rank-based estimate via inverse-variance weighting, requiring no retraining. In molecular property prediction tasks, RankRefine achieves up to a 10% relative reduction in mean absolute error using only 20 pairwise comparisons obtained through a general-purpose large language model (LLM) with no fine-tuning. Because rankings provided by human experts or general-purpose LLMs are sufficient to improve regression across diverse domains, RankRefine is practical and broadly applicable, especially in low-data settings.
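The procedure described in the abstract can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the logistic `scale` parameter, the grid-search maximizer, and the use of inverse Fisher information as the variance of the rank-based estimate are all choices made here for illustration, assuming a Bradley–Terry style likelihood over query-versus-reference comparisons.

```python
import math


def fuse_inverse_variance(mu_reg, var_reg, mu_rank, var_rank):
    """Inverse-variance weighted combination of two independent estimates."""
    w_reg = 1.0 / var_reg
    w_rank = 1.0 / var_rank
    mu = (w_reg * mu_reg + w_rank * mu_rank) / (w_reg + w_rank)
    var = 1.0 / (w_reg + w_rank)
    return mu, var


def _sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def rank_based_estimate(ref_values, query_wins, scale=1.0, grid_size=2001):
    """Estimate the query value from pairwise comparisons (hypothetical helper).

    ref_values : known property values of the reference items
    query_wins : 1 if the query was ranked above that reference, else 0
    scale      : assumed temperature of the logistic comparison model
    Returns (estimate, variance), with variance taken as 1 / Fisher information.
    """
    # Bounded search interval, so the estimate stays finite even if the
    # query wins (or loses) every comparison.
    lo = min(ref_values) - 3.0 * scale
    hi = max(ref_values) + 3.0 * scale
    best_y, best_ll = lo, -math.inf
    for j in range(grid_size):
        y = lo + (hi - lo) * j / (grid_size - 1)
        ll = 0.0
        for r, w in zip(ref_values, query_wins):
            p = _sigmoid((y - r) / scale)
            p = min(max(p, 1e-12), 1.0 - 1e-12)  # guard the logarithm
            ll += math.log(p) if w else math.log(1.0 - p)
        if ll > best_ll:
            best_y, best_ll = y, ll
    # Fisher information of the logistic likelihood at the maximizer.
    info = 0.0
    for r in ref_values:
        p = _sigmoid((best_y - r) / scale)
        info += p * (1.0 - p) / scale**2
    return best_y, 1.0 / max(info, 1e-12)


# Illustrative usage with made-up numbers (not data from the paper).
refs = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
wins = [1, 1, 1, 0, 0, 0]  # query ranked above the three smallest references
mu_rank, var_rank = rank_based_estimate(refs, wins)
mu, var = fuse_inverse_variance(3.4, 0.8, mu_rank, var_rank)
print(f"rank estimate={mu_rank:.3f}, fused={mu:.3f}, fused variance={var:.3f}")
```

The fused variance is never larger than either input variance, so a noisy ranker receives a small weight rather than degrading the base prediction. In practice the base regressor's variance would have to be estimated, for example from held-out residuals; that step is outside this sketch.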
