Pragmatic Theories Enhance Understanding of Implied Meanings in LLMs

Takuma Sato, Seiya Kawano, Koichiro Yoshino

TL;DR

This work shows that injecting concise summaries of Gricean pragmatics and Relevance Theory into zero-shot prompting can bootstrap LLMs to better infer implied meanings without task-specific guidance, achieving up to 9.6% higher accuracy on pragmatic reasoning benchmarks. The proposed Gricean and Relevance Theory prompts guide in-context reasoning, often outperforming baselines and, for some models, reaching or exceeding human performance, with Gricean prompting showing the strongest and most consistent gains. Analyses by phenomenon reveal the approach is especially effective for irony and is generally robust across open and closed models, though metaphoric and maxim-like cases can remain challenging. Additional experiments rule out simple confounds, supporting a genuine effect of theory-informed prompting, while acknowledging limitations related to context richness, cross-language generalization, and the need for deeper mechanistic understanding. Overall, the method offers a simple, broadly applicable prompt engineering technique to enhance pragmatic understanding in LLMs, with implications for dialogue systems and higher-level tasks requiring implicit meaning interpretation.

Abstract

The ability to accurately interpret implied meanings plays a crucial role in human communication and language use, and language models are also expected to possess this capability. This study demonstrates that providing language models with pragmatic theories as prompts is an effective in-context learning approach for tasks that require understanding implied meanings. Specifically, we propose an approach in which an overview of pragmatic theories, such as Gricean pragmatics and Relevance Theory, is presented as a prompt to the language model, guiding it through a step-by-step reasoning process to derive a final interpretation. Experimental results showed that, compared to the baseline, which prompts intermediate reasoning without presenting pragmatic theories (0-shot Chain-of-Thought), our methods enabled language models to achieve up to 9.6% higher scores on pragmatic reasoning tasks. Furthermore, we show that even without explaining the details of pragmatic theories, merely mentioning their names in the prompt leads to a modest performance improvement (around 1-3%) in larger models compared to the baseline.
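As a rough illustration of the three prompting conditions the abstract describes (0-shot Chain-of-Thought baseline, theory name only, and theory overview), the sketch below assembles each prompt as a plain string. The function `build_prompt`, the `THEORIES` dictionary, the instruction wording, and the example item are all hypothetical; the theory summaries are short paraphrases written for this sketch, not the prompt text used in the paper.

```python
from typing import Optional

# Hypothetical instruction wording; the paper's exact phrasing is not reproduced here.
COT_INSTRUCTION = "Think step by step, then give your final answer."

# Paraphrased one-paragraph summaries written for this sketch (an assumption),
# not the overviews used by the authors.
THEORIES = {
    "grice": {
        "name": "Gricean pragmatics",
        "summary": (
            "Speakers are assumed to be cooperative and to follow four maxims: "
            "Quantity (be as informative as required), Quality (say what you "
            "believe to be true), Relation (be relevant), and Manner (be clear). "
            "When a maxim seems to be violated, infer what the speaker implies."
        ),
    },
    "relevance": {
        "name": "Relevance Theory",
        "summary": (
            "Every utterance raises an expectation of its own relevance. The "
            "hearer looks for the interpretation that yields the greatest "
            "cognitive effect for the least processing effort."
        ),
    },
}


def build_prompt(item: str, theory: Optional[str] = None, detailed: bool = True) -> str:
    """Return a zero-shot prompt for one of the three conditions.

    theory=None            -> baseline 0-shot Chain-of-Thought
    theory set, detailed   -> theory overview followed by the task
    theory set, not detailed -> theory mentioned by name only
    """
    parts = []
    if theory is not None:
        entry = THEORIES[theory]
        if detailed:
            parts.append(f"Overview: {entry['summary']}")
        parts.append(f"Interpret the utterance using {entry['name']}.")
    parts.append(item)
    parts.append(COT_INSTRUCTION)
    return "\n\n".join(parts)


if __name__ == "__main__":
    # Hypothetical test item, not taken from any benchmark.
    item = (
        "Ann asks: 'Did you like my cooking?' Bob replies: 'The salad was nice.' "
        "What does Bob imply?"
    )
    print(build_prompt(item, theory="grice"))
```

Keeping the task text and the reasoning instruction identical across conditions means any difference in model output can be attributed to the theory text alone, which is the comparison the abstract reports.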

Paper Structure

This paper contains 35 sections, 6 figures, 15 tables.

Figures (6)

  • Figure 1: Accuracies on the pragmatic inference task of PRAGMEGA. In most models, the proposed methods outperformed the baseline methods. The human scores are those reported in the original paper by Hu et al. (2023). Error bars represent 95% confidence intervals calculated using Wilson's method (Wilson, 1927). Even with a short prompt for the pragmatic theory, larger models showed improvements from the proposed methods; however, the extent of improvement was smaller than when the theory was explained in detail.
  • Figure 2: Accuracy of the model for each pragmatic phenomenon included in PRAGMEGA (Hu et al., 2023) when using different methods. Due to space constraints, we present the results for GPT-4o and Qwen2.5-7B-Instruct (for detailed results, including other models, see the Appendix). The human score is based on Hu et al. (2023).
  • Figure 3: The number of instances for each error pattern by GPT-4o, as described in the main text. A cumulative bar chart represents these counts, including the distribution of each phenomenon within each pattern.
  • Figure 4: Results of the additional experiments.
  • Figure 5: Correlation analysis between input length and accuracy.
  • ...and 1 more figure
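The Figure 1 caption mentions 95% confidence intervals computed with Wilson's method. A minimal sketch of the standard Wilson score interval for a binomial proportion is given below; the function name `wilson_interval` and the example counts are illustrative assumptions and are not taken from the paper.

```python
import math


def wilson_interval(successes: int, n: int, z: float = 1.959964) -> tuple:
    """Wilson score interval for a binomial proportion.

    successes: number of correct answers
    n: number of items evaluated
    z: standard normal quantile (1.959964 gives a 95% interval)
    """
    if n <= 0:
        raise ValueError("n must be positive")
    p = successes / n
    denom = 1.0 + z * z / n
    center = (p + z * z / (2.0 * n)) / denom
    half_width = z * math.sqrt(p * (1.0 - p) / n + z * z / (4.0 * n * n)) / denom
    return center - half_width, center + half_width


if __name__ == "__main__":
    # Illustrative counts only: 50 correct out of 100 items.
    low, high = wilson_interval(50, 100)
    print(f"{low:.4f} {high:.4f}")
```

Unlike the simpler normal-approximation interval, the Wilson interval stays within [0, 1] and behaves sensibly when accuracy is close to 0 or 1, which matters for small per-phenomenon subsets.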