Language Models in Dialogue: Conversational Maxims for Human-AI Interactions
Erik Miehling, Manish Nagireddy, Prasanna Sattigeri, Elizabeth M. Daly, David Piorkowski, John T. Richards
TL;DR
Modern language models exhibit conversational shortcomings, many of which can be traced to violations of core conversational principles. The authors propose an augmented set of maxims (quantity, quality, relevance, manner, benevolence, and transparency) and argue that it applies to human-AI dialogue, with benevolence and transparency addressing AI-specific risks. They operationalize the framework on 1000 conversation samples from Anthropic hh-rlhf, evaluating three LLMs on submaxim labeling to reveal how models internally prioritize the maxims. The study provides a taxonomy and a practical path for evaluating, guiding, and aligning AI conversational behavior, with implications for lightweight detectors, labeling workflows, and constitutional alignment directives.
Abstract
Modern language models, while sophisticated, exhibit some inherent shortcomings, particularly in conversational settings. We claim that many of the observed shortcomings can be attributed to violation of one or more conversational principles. By drawing upon extensive research from both the social science and AI communities, we propose a set of maxims -- quantity, quality, relevance, manner, benevolence, and transparency -- for describing effective human-AI conversation. We first justify the applicability of the first four maxims (from Grice) in the context of human-AI interactions. We then argue that two new maxims, benevolence (concerning the generation of, and engagement with, harmful content) and transparency (concerning recognition of one's knowledge boundaries, operational constraints, and intents), are necessary for addressing behavior unique to modern human-AI interactions. We evaluate the degree to which various language models are able to understand these maxims and find that models possess an internal prioritization of principles that can significantly impact their ability to interpret the maxims accurately.
