IFDID: Information Filter upon Diversity-Improved Decoding for Diversity-Faithfulness Tradeoff in NLG

Han Meng, Xiaosong He, Zexing Chen, Feng Zhou

TL;DR

Compared with traditional decoding approaches, IFDID achieves a ROUGE score 1.24 higher (used as the measure of faithfulness) and a Dist-2 score 62.5% higher (used as the measure of diversity). The authors present this as evidence that IFDID is a new state-of-the-art decoding strategy for the diversity-faithfulness tradeoff.

Abstract

Some Natural Language Generation (NLG) tasks require both faithfulness and diversity. The decoding strategy strongly affects the quality of the generated text. Strategies such as beam search and greedy search produce text with low diversity and high repetition. Guided decoding, on the other hand, improves diversity but may generate unfaithful expressions. To address this, this paper presents Information Filter upon Diversity-Improved Decoding (IFDID), which balances diversity and faithfulness. IFDID is a two-stage decoding strategy built on the proposed Enhance-Filter framework: it first increases the probabilities of some typical tokens being selected and then filters them by their amount of information. We compare our method with other baselines on the CommonGEN, RocStories and AdGen benchmarks, which cover Chinese and English datasets. Both the numerical results and the human evaluation support the effectiveness of the proposed approach: it achieves a ROUGE score 1.24 higher (faithfulness) and a Dist-2 score 62.5% higher (diversity) than traditional approaches, demonstrating that IFDID is a novel SOTA decoding strategy for the tradeoff between diversity and faithfulness.
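The abstract reports diversity with Dist-2. As a rough illustration, the sketch below computes the commonly used Dist-n statistic: the number of distinct n-grams divided by the total number of n-grams across a set of generated texts. The whitespace tokenization and the function names `ngrams` and `dist_n` are assumptions made for this example; the paper's exact tokenization and aggregation are not given in this section and may differ.

```python
def ngrams(tokens, n):
    """Return all contiguous n-grams of a token list as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def dist_n(texts, n=2):
    """Distinct n-grams divided by total n-grams over all texts.

    Assumption: texts are split on whitespace. Returns 0.0 when the
    texts contain no n-gram at all (for example, every text is shorter
    than n tokens).
    """
    all_ngrams = []
    for text in texts:
        all_ngrams.extend(ngrams(text.split(), n))
    if not all_ngrams:
        return 0.0
    return len(set(all_ngrams)) / len(all_ngrams)


if __name__ == "__main__":
    repetitive = ["the dog runs and the dog runs"]
    varied = ["the dog runs across a wide field"]
    print(dist_n(repetitive, n=2))  # lower value: some bigrams repeat
    print(dist_n(varied, n=2))      # 1.0: every bigram is unique
```

A higher value means that fewer n-grams are repeated, which is why the statistic is used as a proxy for diversity.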

Paper Structure (26 sections, 10 equations, 5 figures, 4 tables)

This paper contains 26 sections, 10 equations, 5 figures, 4 tables.

Figures (5)

  • Figure 1: Features of beam search, sampling, guided decoding, and the proposed decoding strategy on the CommonGEN task. The outputs of beam search and sampling are faithful but lack diversity, whereas the outputs of guided decoding strategies are diverse but fail to cover all the information in the concept set. IFDID, the proposed approach, strikes a balance between faithfulness and diversity.
  • Figure 2: Overview of IFDID.
  • Figure 3: Overview of IFDID-SIMI, a practical implementation of the Enhance stage in IFDID. Its modification principle is based on word embedding similarity.
  • Figure 4: The performance of IFDID and IFDID-SIMI in terms of diversity (represented by Dist and Uniq) and faithfulness (represented by BLEU) under different parameter settings. The x-axis shows the controlled parameter, and the y-axis shows the evaluation metric.
  • Figure 5: The Rep-2 metric (a statistic of duplicated 2-grams) for different decoding strategies under various maximum sequence lengths. Text that exceeds the maximum length is truncated.
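The caption of Figure 5 describes Rep-2 only as a statistic of duplicate 2-grams measured after truncation. One common way to define such a statistic is the fraction of n-grams in a sequence that repeat an earlier n-gram, which is what the sketch below computes. This definition, the whitespace tokenization, and the names `rep_n` and `max_len` are assumptions made for illustration, not the paper's confirmed formula.

```python
def rep_n(text, n=2, max_len=None):
    """Fraction of n-grams that duplicate an n-gram in the same text.

    Assumptions: tokens are split on whitespace, and the text is
    truncated to the first max_len tokens before counting.
    Returns 0.0 when the (truncated) text has no n-gram.
    """
    tokens = text.split()
    if max_len is not None:
        tokens = tokens[:max_len]  # drop everything past the length limit
    grams = [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]
    if not grams:
        return 0.0
    return 1.0 - len(set(grams)) / len(grams)


if __name__ == "__main__":
    sample = "a b a b a b"
    for limit in (3, 4, 6):
        print(limit, rep_n(sample, n=2, max_len=limit))
```

Under this definition, a looping sequence scores higher as the length limit grows, which matches the kind of comparison across maximum sequence lengths that the caption describes.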