Text Generation from Knowledge Graphs with Graph Transformers

Rik Koncel-Kedziorski, Dhanush Bekal, Yi Luan, Mirella Lapata, Hannaneh Hajishirzi

TL;DR

This work tackles generating coherent multi-sentence scientific text from automatically extracted knowledge graphs. It introduces GraphWriter, a Graph Transformer–based encoder that processes labeled knowledge graphs without linearization and a copy-enabled decoder that jointly attends to graph and title inputs. Empirical results on the AGENDA dataset show GraphWriter outperforms graph- and non-graph–based baselines in both automatic metrics and human judgments, with notable gains in document structure and informativeness. The study provides a new resource and highlights the value of graph-structured knowledge for improving text generation in scientific domains.

Abstract

Generating texts which express complex ideas spanning multiple sentences requires a structured representation of their content (document plan), but these representations are prohibitively expensive to manually produce. In this work, we address the problem of generating coherent multi-sentence texts from the output of an information extraction system, and in particular a knowledge graph. Graphical knowledge representations are ubiquitous in computing, but pose a significant challenge for text generation techniques due to their non-hierarchical nature, collapsing of long-distance dependencies, and structural variety. We introduce a novel graph transforming encoder which can leverage the relational structure of such knowledge graphs without imposing linearization or hierarchical constraints. Incorporated into an encoder-decoder setup, we provide an end-to-end trainable system for graph-to-text generation that we apply to the domain of scientific text. Automatic and human evaluations show that our technique produces more informative texts which exhibit better document structure than competitive encoder-decoder methods.

Paper Structure

This paper contains 18 sections, 6 equations, 4 figures, 6 tables.

Figures (4)

  • Figure 1: A scientific text showing the annotations of an information extraction system and the corresponding graphical representation. Coreference annotations shown in color. Our model learns to generate texts from automatically extracted knowledge using a graph encoder decoder setup.
  • Figure 2: Converting disconnected labeled graph to connected unlabeled graph for use in attention-based encoder. $v_i$ refer to vertices, $R_{ij}$ to relations, and $G$ is a global context node.
  • Figure 3: GraphWriter Model Overview
  • Figure 4: Graph Transformer
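
The graph conversion described in the Figure 2 caption can be illustrated with a minimal sketch. It assumes that each labeled relation $R_{ij}$ becomes a pair of new vertices (one for the forward direction, one for the reverse) and that the global context node $G$ is linked to every entity vertex. The function name `to_connected_unlabeled`, the `#k` and `-inv` naming scheme, and the toy entities are hypothetical and chosen only for illustration; they are not taken from the paper's code.

```python
from collections import defaultdict


def to_connected_unlabeled(entities, triples, global_node="G"):
    """Turn a labeled, possibly disconnected graph into a connected unlabeled one.

    entities: list of entity names (the vertices v_i)
    triples:  list of (head, relation, tail) tuples (the labeled edges R_ij)
    Returns (nodes, adjacency), where adjacency maps a node to a sorted list
    of the nodes it points to. Edges carry no labels after conversion.
    """
    adj = defaultdict(set)
    nodes = list(entities) + [global_node]

    for k, (head, rel, tail) in enumerate(triples):
        # Assumption: one vertex per direction of the relation, so the
        # original edge direction is still recoverable without labels.
        fwd = f"{rel}#{k}"
        bwd = f"{rel}-inv#{k}"
        nodes += [fwd, bwd]
        adj[head].add(fwd)
        adj[fwd].add(tail)
        adj[tail].add(bwd)
        adj[bwd].add(head)

    # The global node touches every entity, which connects isolated components.
    for entity in entities:
        adj[global_node].add(entity)
        adj[entity].add(global_node)

    return nodes, {n: sorted(adj[n]) for n in nodes}


def reachable(adjacency, start):
    """Return the set of nodes reachable from start by following edges."""
    seen, stack = {start}, [start]
    while stack:
        for nxt in adjacency[stack.pop()]:
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


# Toy input: "accuracy" has no relations, so the labeled graph is disconnected.
nodes, adjacency = to_connected_unlabeled(
    ["CRF", "NER", "accuracy"], [("CRF", "used-for", "NER")]
)
print(reachable(adjacency, "G") == set(nodes))
```

In this toy input the entity `accuracy` takes part in no relation, yet after conversion every vertex is reachable from `G`, which is the property an attention-based encoder restricted to graph neighborhoods would rely on.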