Do prompt positions really matter?

Junyu Mao; Stuart E. Middleton; Mahesan Niranjan

TL;DR

This paper asks whether prompt position matters in prompt-based learning, through a cross-task study covering zero-shot and few-shot settings, two prompt styles (cloze and prefix), and both gradient-based and gradient-free paradigms. It finds substantial performance variability across prompt positions, with no universally superior position. Larger models reduce but do not eliminate this variance, and instruction-tuned models do not consistently mitigate it. The work shows that many prompt positions commonly used in prior studies are suboptimal, and it argues for prompt position optimization and position-aware instruction tuning as promising directions for robustness. These findings have practical implications for prompt engineering, particularly when data are scarce or tasks are diverse.

Abstract

Prompt-based models have attracted considerable attention from researchers due to their remarkable advances in zero-shot and few-shot learning. Developing an effective prompt template is critical to their performance. However, prior studies have mainly focused on prompt vocabulary search or embedding initialisation within a predefined template in which the prompt position is fixed. In this empirical study, we conduct the most comprehensive analysis to date of prompt position for diverse Natural Language Processing (NLP) tasks. Our findings quantify the substantial impact prompt position has on model performance. We observe that the prompt positions used in prior studies are often sub-optimal, and this observation holds even for widely used instruction-tuned models. These findings suggest prompt position optimisation as a valuable research direction to augment prompt engineering methodologies, and prompt position-aware instruction tuning as a potential way to build more robust models in the future.
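To make the notion of "prompt position" concrete, the following is a minimal illustrative sketch, not the authors' implementation. It places the same cloze-style prompt before, after, or inside a single input text. The function name `build_prompts`, the position labels, and the example prompt `It was [MASK].` are all hypothetical choices made for illustration; the positions actually studied in the paper may differ.

```python
from typing import Dict


def build_prompts(text: str, prompt: str) -> Dict[str, str]:
    """Place the same prompt at different positions relative to the input.

    Hypothetical helper for illustration only; the position names below
    are assumptions, not the paper's terminology.
    """
    words = text.split()
    mid = len(words) // 2  # split point for the "insert_middle" variant
    return {
        "prepend": f"{prompt} {text}",
        "append": f"{text} {prompt}",
        "insert_middle": " ".join(words[:mid] + [prompt] + words[mid:]),
    }


if __name__ == "__main__":
    variants = build_prompts(
        "The movie was a delight from start to finish.",
        "It was [MASK].",
    )
    for name, templated in variants.items():
        print(f"{name}: {templated}")
```

Each variant carries identical prompt tokens and identical input tokens; only their relative order changes, which is the factor the study isolates.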
