User Feedback in Human-LLM Dialogues: A Lens to Understand Users But Noisy as a Learning Signal
Yuhan Liu, Michael J. Q. Zhang, Eunsol Choi
TL;DR
This work systematically investigates implicit user feedback in real-world human-LLM dialogues. It formalizes multi-turn interactions and feedback ontologies, and builds dense, manually annotated datasets on LMSYS and WildChat. It analyzes when feedback arises, its linguistic characteristics, and its potential as a learning signal, finding that toxicity and prompt quality both influence feedback patterns. The authors then regenerate model outputs using the semantics of the feedback and train LLMs on the regenerated data. Strong LLMs can help weaker models and yield gains on MTBench, but results on the more complex, real-world WildBench benchmark are mixed. The findings underscore both the promise and the challenges of leveraging implicit, noisy user feedback for scalable alignment in deployed systems, and highlight the need to account for data quality, model strength, and task complexity.
Abstract
Once language models (LMs) are deployed, they can interact with users long-term, ideally evolving based on user feedback. Asking for direct user feedback can be disruptive; thus, we study harvesting implicit user feedback from user-LM interaction logs, using two user-LM interaction datasets (WildChat and LMSYS). First, we analyze user feedback in the user-LM conversation logs, providing insights into when and why such feedback occurs. Second, we study harvesting learning signals from such implicit user feedback. Specifically, we study whether incorporating the content of user feedback (e.g., the user wanted clarification), in addition to the polarity of the feedback, can improve model performance. We observe mixed results: this helps on short, human-designed questions (MTBench) but not on longer and more complex questions (WildBench). Together, these results provide an in-depth study of implicit user feedback, showing its potential and limitations.
