Keywords and Instances: A Hierarchical Contrastive Learning Framework Unifying Hybrid Granularities for Text Generation

Mingzhe Li, XieXiong Lin, Xiuying Chen, Jinxiong Chang, Qishen Zhang, Feng Wang, Taifeng Wang, Zhongyi Liu, Wei Chu, Dongyan Zhao, Rui Yan

TL;DR

This work tackles exposure bias and inadequate word-level guidance in text generation by introducing a hierarchical contrastive learning framework built on CVAE. It combines instance-level KL-based distribution alignment, a keyword-graph-driven keyword-level contrast, and a Mahalanobis inter-contrast that ties instance and keyword representations through a distribution-aware metric. The approach is instantiated with a keyword graph to polish keyword representations and an inter-level loss to mitigate contrast vanishing, yielding improvements across paraphrasing, dialogue, and storytelling tasks on QQP, Douban, and RocStories. Empirical results from automatic metrics and human judgments demonstrate the method’s effectiveness and robustness, with ablations confirming the necessity of each component. Overall, the paper presents a principled, distribution-aware, multi-granularity contrastive framework that enhances controllable text generation and semantic fidelity in multiple generation domains.

Abstract

Contrastive learning has achieved impressive success in generation tasks, where it mitigates the "exposure bias" problem and discriminatively exploits references of differing quality. Existing work mostly applies contrastive learning at the instance level without distinguishing the contribution of each word, even though keywords are the gist of the text and dominate the constrained mapping relationships. Hence, in this work, we propose a hierarchical contrastive learning mechanism that unifies semantic meaning at hybrid granularities in the input text. Concretely, we first propose a keyword graph, built from contrastive correlations of positive-negative pairs, to iteratively polish the keyword representations. Then, we construct intra-contrasts at the instance level and the keyword level, where we assume words are nodes sampled from a sentence distribution. Finally, to bridge the gap between the independent contrast levels and tackle the common contrast vanishing problem, we propose an inter-contrast mechanism that measures the discrepancy between each contrastive keyword node and the instance distribution. Experiments demonstrate that our model outperforms competitive baselines on paraphrasing, dialogue generation, and storytelling tasks.
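
To make the inter-contrast idea concrete, here is a minimal sketch of how a keyword vector might be compared against an instance distribution with a Mahalanobis distance. It is an illustration only, not the authors' implementation: the diagonal-covariance Gaussian, the hinge form of the loss, the margin value, and the names `mahalanobis_diag` and `inter_contrast_hinge` are all assumptions introduced here.

```python
import math

def mahalanobis_diag(x, mu, var):
    """Mahalanobis distance from point x to a Gaussian N(mu, diag(var)).

    Assumption: the instance distribution is a diagonal Gaussian, so the
    inverse covariance reduces to an element-wise division by the variance.
    """
    return math.sqrt(sum((xi - mi) ** 2 / vi for xi, mi, vi in zip(x, mu, var)))

def inter_contrast_hinge(pos_kw, neg_kw, mu, var, margin=1.0):
    """Hypothetical hinge loss: the positive keyword should sit closer to the
    instance distribution than the negative keyword by at least `margin`."""
    d_pos = mahalanobis_diag(pos_kw, mu, var)
    d_neg = mahalanobis_diag(neg_kw, mu, var)
    return max(0.0, d_pos - d_neg + margin)

# Toy 2-D example (made-up numbers, for illustration only).
mu = [0.0, 0.0]       # mean of the instance (sentence) distribution
var = [1.0, 4.0]      # per-dimension variance
pos_kw = [1.0, 0.0]   # positive keyword representation
neg_kw = [0.0, 6.0]   # negative keyword representation

loss_ok = inter_contrast_hinge(pos_kw, neg_kw, mu, var)       # already separated
loss_swapped = inter_contrast_hinge(neg_kw, pos_kw, mu, var)  # roles reversed
```

Because the distance is scaled by the variance, a keyword that is far away along a high-variance dimension is penalized less than one that is equally far along a low-variance dimension, which is what makes the metric distribution-aware rather than purely Euclidean.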

Paper Structure (35 sections, 8 equations, 3 figures, 4 tables)

This paper contains 35 sections, 8 equations, 3 figures, 4 tables.

Figures (3)

  • Figure 1: The semantic meaning of the sentence "what are the best books on cosmology?" would be greatly changed if the keyword "cosmology" is changed to "astrophysics".
  • Figure 2: The architecture of hierarchical contrastive learning, which consists of three parts: (1) Keyword-level contrast from keyword graph; (2) Instance-level contrast based on KL divergence for semantic distribution; and (3) Mahalanobis contrast between instance-level and keyword-level.
  • Figure 3: Visualization of contrastive learning. The square, circle, and triangle represent the input text, positive output sample, and negative output sample, respectively. Blue represents the sentence, and yellow represents the keyword.
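
The instance-level contrast in Figure 2 is described as based on KL divergence between semantic distributions. The sketch below shows one way such a comparison could look, using the closed-form KL divergence between diagonal Gaussians. It is not taken from the paper: the diagonal-Gaussian parameterization, the direction of the KL term, the hinge form, and the names `kl_diag_gauss` and `instance_contrast_hinge` are assumptions made for illustration.

```python
import math

def kl_diag_gauss(mu_p, var_p, mu_q, var_q):
    """Closed-form KL( N(mu_p, diag(var_p)) || N(mu_q, diag(var_q)) )."""
    total = 0.0
    for mp, vp, mq, vq in zip(mu_p, var_p, mu_q, var_q):
        total += math.log(vq / vp) + (vp + (mp - mq) ** 2) / vq - 1.0
    return 0.5 * total

def instance_contrast_hinge(anchor, positive, negative, margin=1.0):
    """Hypothetical hinge loss over (mean, variance) pairs: the positive
    sample's distribution should be closer to the anchor's than the
    negative sample's by at least `margin` (direction of KL is assumed)."""
    kl_pos = kl_diag_gauss(positive[0], positive[1], anchor[0], anchor[1])
    kl_neg = kl_diag_gauss(negative[0], negative[1], anchor[0], anchor[1])
    return max(0.0, kl_pos - kl_neg + margin)

# Toy (mean, variance) pairs; the numbers are made up for illustration.
anchor = ([0.0, 0.0], [1.0, 1.0])    # input text
positive = ([0.5, 0.0], [1.0, 1.0])  # positive output sample
negative = ([3.0, 0.0], [1.0, 1.0])  # negative output sample

kl_pos = kl_diag_gauss(positive[0], positive[1], anchor[0], anchor[1])
kl_neg = kl_diag_gauss(negative[0], negative[1], anchor[0], anchor[1])
loss = instance_contrast_hinge(anchor, positive, negative)
```

In this toy setup the positive sample's distribution is much closer to the anchor than the negative one, so the hinge is inactive and the loss is zero; moving the positive mean away from the anchor would make the loss positive.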