More Bias, Less Bias: BiasPrompting for Enhanced Multiple-Choice Question Answering
Duc Anh Vu, Thong Nguyen, Cong-Duy Nguyen, Viet Anh Nguyen, Anh Tuan Luu
TL;DR
BiasPrompting addresses bias and contextual grounding gaps in MCQ answering by forcing LLMs to generate and compare reasoning for all answer options before deciding. The method comprises two stages: Reasoning Generation and Reasoning-Guided Agreement, enabling comprehensive exploration of options and more informed final predictions. Empirical results across five MCQ benchmarks with open-source 7B LLMs show consistent accuracy gains, greater stability than Chain-of-Thought prompting, and reduced token usage, while mitigating option-order biases. The work highlights latent reasoning capabilities within a single LLM and suggests BiasPrompting as a practical, efficient approach to enhancing MCQ reasoning and potentially broader reasoning tasks.
Abstract
With the advancement of large language models (LLMs), their performance on multiple-choice question (MCQ) tasks has improved significantly. However, existing approaches face a key limitation: answer choices are typically presented to LLMs without contextual grounding or explanation. This absence of context can lead to incomplete exploration of the possible answers, ultimately degrading the models' reasoning capabilities. To address this challenge, we introduce BiasPrompting, a novel inference framework that guides LLMs to generate and critically evaluate reasoning across all plausible answer options before reaching a final prediction. It consists of two components: first, a reasoning generation stage, where the model is prompted to produce supporting reasoning for each answer option, and then a reasoning-guided agreement stage, where the generated reasoning is synthesized to select the most plausible answer. Through comprehensive evaluations, BiasPrompting demonstrates significant improvements on five widely used multiple-choice question answering benchmarks. Our experiments show that BiasPrompting enhances the reasoning capabilities of LLMs and provides a strong foundation for tackling complex and challenging questions, particularly in settings where existing methods underperform.
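
The two stages described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the prompt wording, the function names (`generate_reasonings`, `agree_on_answer`, `bias_prompting`), the `LLM` callable interface, and the naive answer parsing are all assumptions made for illustration, and the paper's actual prompt templates may differ.

```python
import re
from typing import Callable, Dict, List

# Hypothetical interface: any function that maps a prompt string to a completion string.
LLM = Callable[[str], str]


def option_label(index: int) -> str:
    """Map 0, 1, 2, ... to 'A', 'B', 'C', ..."""
    return chr(ord("A") + index)


def generate_reasonings(question: str, options: List[str], llm: LLM) -> Dict[str, str]:
    """Stage 1 (reasoning generation): one call per option, each asking the
    model to argue in favor of that option."""
    reasonings = {}
    for index, option in enumerate(options):
        label = option_label(index)
        prompt = (
            f"Question: {question}\n"
            f"Assume the answer is ({label}) {option}.\n"
            "Explain why this answer could be correct."
        )
        reasonings[label] = llm(prompt)
    return reasonings


def agree_on_answer(question: str, options: List[str],
                    reasonings: Dict[str, str], llm: LLM) -> str:
    """Stage 2 (reasoning-guided agreement): show the model every option with
    its supporting reasoning and ask for a single final choice."""
    parts = [f"Question: {question}", "Options with supporting reasoning:"]
    for index, option in enumerate(options):
        label = option_label(index)
        parts.append(f"({label}) {option}\nReasoning: {reasonings[label]}")
    parts.append(
        "Considering all of the reasoning above, reply with only the letter "
        "of the most plausible option."
    )
    reply = llm("\n".join(parts))
    # Naive parsing (an assumption): take the first standalone option letter.
    valid = "".join(reasonings)
    match = re.search(rf"\b([{valid}])\b", reply)
    if match is None:
        raise ValueError(f"No option letter found in reply: {reply!r}")
    return match.group(1)


def bias_prompting(question: str, options: List[str], llm: LLM) -> str:
    """Run both stages and return the predicted option letter."""
    reasonings = generate_reasonings(question, options, llm)
    return agree_on_answer(question, options, reasonings, llm)
```

For a question with n options, this sketch makes n + 1 model calls: one per option in the first stage and one for the final decision. Passing any prompt-to-text function as `llm` keeps the sketch independent of a particular model or serving library.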
