Reading Between the Lines: The One-Sided Conversation Problem

Victoria Ebert, Rishabh Singh, Tuochao Chen, Noah A. Smith, Shyamnath Gollakota

TL;DR

This paper formalizes the one-sided conversation (1SC) problem, addressing how to infer and learn from a dialogue when only one speaker is observed. It studies two tasks—reconstructing missing turns in real time and generating summaries from one-sided transcripts—evaluated on MultiWOZ, DailyDialog, and Candor using both human judgments and LLM-based metrics. The authors compare finetuning small models versus prompting large models for reconstruction, finding that larger models with prompts perform best and that placeholders help control hallucinations, while high-quality summaries can be produced directly from one-sided input without reconstruction in many settings. The work highlights privacy-aware AI implications, outlines practical evaluation frameworks, and points to future directions in infilling, controllable generation, and privacy-preserving deployment.

Abstract

Conversational AI is constrained in many real-world settings where only one side of a dialogue can be recorded, such as telemedicine, call centers, and smart glasses. We formalize this as the one-sided conversation problem (1SC): inferring and learning from one side of a conversation. We study two tasks: (1) reconstructing the missing speaker's turns for real-time use cases, and (2) generating summaries from one-sided transcripts. Evaluating prompting and finetuned models on MultiWOZ, DailyDialog, and Candor with both human A/B testing and LLM-as-a-judge metrics, we find that access to one future turn and information about utterance length improves reconstruction, placeholder prompting helps to mitigate hallucination, and while large models generate promising reconstructions with prompting, smaller models require finetuning. Further, high-quality summaries can be generated without reconstructing missing turns. We present 1SC as a novel challenge and report promising results that mark a step toward privacy-aware conversational AI.

Paper Structure

This paper contains 57 sections, 7 figures, 4 tables.

Figures (7)

  • Figure 1: We introduce the one-sided conversation (1SC) problem: making inferences from only one side of a conversation transcript. We focus on reconstruction of the missing content and creating summaries of the whole one-sided conversation.
  • Figure 2: Examples of the different levels of context we consider for other-party reconstruction. Context is marked by green boxes; in this example we are predicting turn 4. Our baseline full-context version (a) gives the whole conversation up to turn 4. We also experiment with including turn 5 (b) and with including the length of each masked utterance (c). Finally, we test a local-context version (d) that gives only turns 3, 4, and 5.
  • Figure 3: Using our extraction-based metrics, we show macro-averaged precision and recall scores for each dataset ($n = 493$ for DailyDialog, $n = 705$ for MultiWOZ, $n = 2{,}371$ for Candor). Note that since the details used for the precision and recall values are extracted by the evaluator LLM, the absolute numbers are not as meaningful as the relative differences between methods.
  • Figure 4: Example cases from our evaluation rubric for other-party reconstruction, showing high (a), average (b), and low (c) rubric scores.
  • Figure 5: Human evaluation summary results. The masked-dialogue summary either outperformed the reconstructed-dialogue summary (DailyDialog) or performed similarly (MultiWOZ).
  • ...and 2 more figures
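
The four context levels described in the Figure 2 caption can be illustrated with a minimal Python sketch. This is not the authors' code: the function name `build_context`, the placeholder strings, the example dialogue, and the use of word counts as the "length" signal are all assumptions made for illustration.

```python
from typing import List, Set

# Hypothetical placeholder strings; the paper's actual prompt format is not shown here.
MASK = "[MASKED]"
TARGET = "[TO RECONSTRUCT]"


def build_context(
    turns: List[str],
    target: int,
    hidden: Set[int],
    future: bool = False,
    lengths: bool = False,
    local: bool = False,
) -> str:
    """Render the context shown to a model when reconstructing one hidden turn.

    turns   -- all utterances, 0-indexed (only used to fill observed turns
               and, if requested, to derive lengths of hidden turns)
    target  -- index of the hidden turn to reconstruct
    hidden  -- indices of the unobserved speaker's turns (must include target)
    future  -- also show the turn right after the target (variant b)
    lengths -- annotate hidden turns with a word count (variant c, assumed unit)
    local   -- show only the turns adjacent to the target (variant d)
    """
    start = max(target - 1, 0) if local else 0
    end = target + 1 if (future or local) else target
    end = min(end, len(turns) - 1)

    lines = []
    for i in range(start, end + 1):
        if i == target:
            text = TARGET
        elif i in hidden:
            text = MASK
        else:
            text = turns[i]
        if lengths and i in hidden:
            text += f" (~{len(turns[i].split())} words)"
        lines.append(f"Turn {i + 1}: {text}")
    return "\n".join(lines)


# Invented example dialogue; turns 2 and 4 belong to the unobserved speaker.
dialogue = [
    "Hi, I need a taxi.",
    "Sure, where to?",
    "To the station.",
    "What time would you like to leave?",
    "Around 5 pm.",
]
hidden_turns = {1, 3}

full_ctx = build_context(dialogue, target=3, hidden=hidden_turns)
future_ctx = build_context(dialogue, target=3, hidden=hidden_turns, future=True)
length_ctx = build_context(dialogue, target=3, hidden=hidden_turns, lengths=True)
local_ctx = build_context(dialogue, target=3, hidden=hidden_turns, local=True)
```

Here `full_ctx` corresponds to variant (a), `future_ctx` to (b), `length_ctx` to (c), and `local_ctx` to (d). Earlier hidden turns stay masked in every variant, so the model never sees the unobserved speaker's actual words.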