Reading with Intent -- Neutralizing Intent

Benjamin Reichman, Adar Avsian, Larry Heck

TL;DR

This work studies how the diverse emotional tones of retrieved internet passages affect RAG systems. It extends the Reading with Intent task with a synthetic dataset covering $11$ emotions and trains an emotion-translator to neutralize passages. The dataset pipeline starts from open-domain QA (Natural Questions) with GPL-based retrieval, which yields a corpus of $370{,}920$ top-$10$ retrieved passages; a multi-model generation process then expands it into a synthetic corpus of $3{,}636{,}592$ passages across $11$ emotions. The emotion-translator (Llama-3.1-8B-Instruct fine-tuned with LoRA) shows high fidelity in both emotion transfer and content preservation, as validated by human judgments and BLEU-based metrics. Applied to the Reading with Intent task, neutralization yields a $2.8\%$ average improvement on sarcastic but factually accurate contexts, while gains on fact-distorted sarcastic data are negligible. Overall, the approach shows promise but also exposes limitations in handling passages that mix sarcasm with content deception in real-world settings.

Abstract

Queries to large language models (LLMs) can be divided into two parts: the instruction/question and the accompanying context. In most benchmarks, the context for retrieval-augmented generation (RAG) systems comes from Wikipedia or Wikipedia-like texts, which are written in a neutral and factual tone. However, when RAG systems retrieve internet-based content, they encounter text with diverse tones and linguistic styles, which introduces challenges for downstream tasks. The Reading with Intent task addresses this issue by evaluating how varying tones in context passages affect model performance. Building on prior work that focused on sarcasm, we extend this paradigm by constructing a dataset in which context passages are transformed to $11$ distinct emotions using an improved synthetic data generation approach. Using this dataset, we train an emotion translation model to systematically adapt passages to specified emotional tones. Human evaluation shows that the LLM fine-tuned as the emotion-translator benefited from the synthetically generated data. Finally, the emotion-translator is used in the Reading with Intent task to transform the passages to a neutral tone. By neutralizing the passages, it mitigates the challenges posed by sarcastic passages and improves overall results on this task by about $3\%$.
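
The final step of the abstract (neutralize the retrieved passages, then answer from them) can be illustrated with a minimal sketch. Everything named here is hypothetical: `neutralize_then_read`, `toy_translate`, and `toy_read` are illustrative stand-ins, not the authors' code, and the toy translator only strips two hard-coded sarcastic markers rather than calling a fine-tuned model.

```python
from typing import Callable, List

# Hypothetical interfaces (assumptions, not from the paper):
# a translator maps (passage, target_emotion) -> rewritten passage,
# a reader maps (question, passages) -> answer.
Translator = Callable[[str, str], str]
Reader = Callable[[str, List[str]], str]


def neutralize_then_read(
    question: str,
    passages: List[str],
    translate: Translator,
    read: Reader,
    target_emotion: str = "neutral",
) -> str:
    """Rewrite every retrieved passage to the target tone, then answer."""
    neutral_passages = [translate(p, target_emotion) for p in passages]
    return read(question, neutral_passages)


def toy_translate(passage: str, emotion: str) -> str:
    # Stand-in for the emotion-translator: removes two fixed sarcastic markers.
    return passage.replace("Oh sure, ", "").replace(" Obviously.", "")


def toy_read(question: str, passages: List[str]) -> str:
    # Stand-in for the reader LLM: simply concatenates the passages.
    return " ".join(passages)


answer = neutralize_then_read(
    "What is the capital of France?",
    ["Oh sure, Paris is the capital of France. Obviously."],
    toy_translate,
    toy_read,
)
print(answer)
```

In a real setting, the two callables would wrap the fine-tuned emotion-translator and the downstream reader model; keeping them as plain function arguments makes it easy to compare answers with and without the neutralization step.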

Paper Structure (10 sections, 8 figures, 3 tables)

Figures (8)

  • Figure 1: Synthetic data generation process.
  • Figure 2: The KL-Divergences between the unigram, bigrams, and trigrams of the original and synthetic datasets.
  • Figure 3: The average length of the passages from each model and overall.
  • Figure 4: Human evaluation of the emotional reconstruction of the human-written text as compared to the original text.
  • Figure 5: Human evaluation of the emotional reconstruction of the human-written text as compared to the emotional reconstruction of the text by an un-fine-tuned LLM.
  • ...and 3 more figures
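
Figure 2 compares the n-gram distributions of the original and synthetic datasets through KL divergence. The summary does not say how those distributions were estimated, so the following is only a sketch under stated assumptions: lowercase whitespace tokenization, add-one smoothing over the union of observed n-grams, and the direction $\mathrm{KL}(\text{original} \,\|\, \text{synthetic})$. The function names `ngram_counts` and `kl_divergence` are illustrative.

```python
import math
from collections import Counter
from typing import Iterable


def ngram_counts(texts: Iterable[str], n: int) -> Counter:
    """Count n-grams over a collection of texts (assumed whitespace tokenization)."""
    counts: Counter = Counter()
    for text in texts:
        tokens = text.lower().split()
        for i in range(len(tokens) - n + 1):
            counts[tuple(tokens[i:i + n])] += 1
    return counts


def kl_divergence(p_counts: Counter, q_counts: Counter, alpha: float = 1.0) -> float:
    """KL(P || Q) in nats, with additive smoothing (an assumption) so Q is never zero."""
    vocab = set(p_counts) | set(q_counts)
    if not vocab:
        return 0.0
    p_total = sum(p_counts.values()) + alpha * len(vocab)
    q_total = sum(q_counts.values()) + alpha * len(vocab)
    kl = 0.0
    for gram in vocab:
        p = (p_counts[gram] + alpha) / p_total
        q = (q_counts[gram] + alpha) / q_total
        kl += p * math.log(p / q)
    return kl


original = ["the cat sat on the mat"]
synthetic = ["oh great the cat sat on the mat again"]

for n in (1, 2, 3):
    score = kl_divergence(ngram_counts(original, n), ngram_counts(synthetic, n))
    print(f"{n}-gram KL(original || synthetic) = {score:.4f}")
```

A value near zero means the synthetic passages use n-grams in roughly the same proportions as the originals; larger values indicate that the rewriting shifted the wording. Because KL divergence is asymmetric, swapping the arguments generally gives a different number.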