Promptly Predicting Structures: The Return of Inference
Maitrey Mehta, Valentina Pyatkin, Vivek Srikumar
TL;DR
This work addresses structured prediction in NLP under zero- and few-shot regimes by pairing prompt-based local decisions with global inference that enforces structural constraints. Each local label $y_i$ is scored by its own prompt $q_i$, so the joint score factorizes as $P(Y \mid X, Q) = \prod_i P(y_i \mid X, q_i)$. Predicting a structure then becomes the constrained optimization problem $Y^* = \arg\max_Y \prod_i P(y_i \mid X, q_i)$ subject to structural validity, which can be solved with methods such as integer linear programming (ILP) or shortest-path search. The authors instantiate the framework on Semantic Role Labeling and Coreference Resolution across five datasets, showing that enforcing consistency not only yields valid outputs but also improves task performance over unconstrained prompting. The results indicate that constraint-driven inference reduces invalid outputs, improves robustness across tasks, and can compensate for smaller model sizes, with notable gains from iterative prompting and few-shot setups. The approach offers a practical route to reliable structured prediction with minimal labeled data, making LLMs more applicable to linguistically structured tasks in diverse domains.
Abstract
Prompt-based methods have been used extensively across NLP to build zero- and few-shot label predictors. Many NLP tasks are naturally structured: that is, their outputs consist of multiple labels which constrain each other. Annotating data for such tasks can be cumbersome. Can the promise of the prompt-based paradigm be extended to such structured outputs? In this paper, we present a framework for constructing zero- and few-shot linguistic structure predictors. Our key insight is that we can use structural constraints -- and combinatorial inference derived from them -- to filter out inconsistent structures predicted by large language models. We instantiate this framework on two structured prediction tasks and five datasets. Across all cases, our results show that enforcing consistency not only constructs structurally valid outputs, but also improves performance over the unconstrained variants.
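
To make the idea of constrained inference over prompt-scored local decisions concrete, here is a minimal Python sketch. It is not the authors' implementation: the probabilities in `LOCAL_PROBS` are made up, the label set and the "each core role at most once" constraint are a toy stand-in for real structural constraints, and exhaustive enumeration is swapped in for the ILP or shortest-path solvers mentioned in the TL;DR. All function names are hypothetical.

```python
import itertools
import math

# Hypothetical per-span label probabilities P(y_i | X, q_i), as if each
# came from one prompt per candidate span. The numbers are invented.
LOCAL_PROBS = [
    {"A0": 0.6, "A1": 0.3, "NONE": 0.1},
    {"A0": 0.5, "A1": 0.4, "NONE": 0.1},
    {"A0": 0.1, "A1": 0.2, "NONE": 0.7},
]


def is_valid(labels):
    """Toy structural constraint: each core role (A0, A1) is used at most once."""
    core = [y for y in labels if y != "NONE"]
    return len(core) == len(set(core))


def log_score(labels, probs):
    """Log of the product of local probabilities for a full labeling."""
    return sum(math.log(p[y]) for p, y in zip(probs, labels))


def unconstrained_decode(probs):
    """Pick the most probable label for each span independently."""
    return tuple(max(p, key=p.get) for p in probs)


def constrained_decode(probs):
    """Brute-force argmax over all labelings that satisfy the constraint.

    Enumeration is exponential in the number of spans; it stands in here
    for a proper combinatorial solver and is only suitable for tiny inputs.
    """
    candidates = itertools.product(*(p.keys() for p in probs))
    valid = (c for c in candidates if is_valid(c))
    return max(valid, key=lambda c: log_score(c, probs))


if __name__ == "__main__":
    print("unconstrained:", unconstrained_decode(LOCAL_PROBS))
    print("constrained:  ", constrained_decode(LOCAL_PROBS))
```

In this toy input, independent decoding assigns `A0` to two spans, which violates the constraint, while the constrained search returns the highest-scoring labeling that uses each core role at most once.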
