Probing Omissions and Distortions in Transformer-based RDF-to-Text Models

Juliette Faille; Albert Gatt; Claire Gardent

TL;DR

Both omitted and distorted entities can be detected by probing the encoder's output embeddings, which suggests that the encoder emits a weaker signal for these entities and is therefore responsible for some loss of information.

Abstract

In Natural Language Generation (NLG), important information is sometimes omitted from the output text. To better understand and analyse how this type of mistake arises, we focus on RDF-to-Text generation and explore two methods of probing omissions in the encoder output of BART (Lewis et al., 2020) and of T5 (Raffel et al., 2019): (i) a novel parameter-free probing method based on the cosine similarity between the embeddings of RDF graphs and of the same graphs with some entities removed, and (ii) a parametric probe which performs binary classification on the encoder embeddings to detect omitted entities. We also extend our analysis to distorted entities, i.e., entities that are not fully correctly mentioned in the generated text (e.g., a misspelled entity or wrong units of measurement). We found that both omitted and distorted entities can be probed in the encoder's output embeddings. This suggests that the encoder emits a weaker signal for these entities and is therefore responsible for some loss of information. It also shows that probing methods can be used to detect mistakes in the output of NLG models.
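As a rough illustration of the parameter-free probe described in the abstract, the sketch below compares the embedding of a linearized RDF graph with the embedding of the same graph after one entity is removed. It is not the authors' implementation: `toy_token_vector` is a hypothetical stand-in for the BART or T5 encoder, mean pooling is an assumed way of getting one vector per graph, and the triples are illustrative only.

```python
import math


def toy_token_vector(token, dim=8):
    # Hypothetical stand-in for one encoder output vector; NOT BART or T5.
    seed = sum(ord(ch) for ch in token)
    return [math.sin((i + 1) * seed) for i in range(dim)]


def embed(tokens, dim=8):
    # Mean-pool per-token vectors into one graph-level embedding.
    # Mean pooling is an assumption made for this sketch.
    vectors = [toy_token_vector(tok, dim) for tok in tokens]
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def cosine_similarity(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def removal_similarity(graph_tokens, entity_tokens):
    # Similarity between the full graph and the graph without the entity.
    reduced = [tok for tok in graph_tokens if tok not in entity_tokens]
    if not reduced:
        raise ValueError("removing the entity left an empty graph")
    return cosine_similarity(embed(graph_tokens), embed(reduced))


# Illustrative linearized triples (subject, property, object).
graph = ["Alan_Bean", "birthPlace", "Wheeler_Texas",
         "Alan_Bean", "occupation", "Test_pilot"]
similarity = removal_similarity(graph, {"Wheeler_Texas"})
print(f"similarity after removing the entity: {similarity:.3f}")
```

With a real encoder in place of the toy function, a similarity close to 1 would mean that removing the entity barely changes the graph embedding. That is one possible reading of the "weaker signal" the abstract refers to, offered here as an assumption rather than as the paper's exact procedure.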

Paper Structure (42 sections, 4 equations, 1 figure, 13 tables)

Introduction
Related Work
Probing of Pre-trained Models
Content-related issues in text generation
NLG Models and Annotated Data
Generation Model
(RDF,Text) Data
Annotated Data
Data for the probing experiments
Evaluation of Automatic Annotation
Exploring the possible role of decoding strategies
Method
Parameter-free Probing
Results
Parametric Probing: Binary classifiers
...and 27 more sections

Figures (1)

Figure 1: Example of an RDF input and Generated Text with corresponding results of the automatic entity detection, and manual annotations of omissions and distortions
