Memory-Augmented Generative Adversarial Transformers

Stephan Raaijmakers; Roos Bakker; Anita Cremers; Roy de Kleijn; Tom Kouwenhoven; Tessa Verhoef

TL;DR

This paper extends the standard Transformer architecture with a memory bank holding additional information and an extra attention layer for addressing that memory, and adds this augmented memory to a Generative Adversarial Network-inspired Transformer architecture.

Abstract

Conversational AI systems that rely on Large Language Models, like Transformers, have difficulty interweaving external data (like facts) with the language they generate. Vanilla Transformer architectures are not designed for answering factual questions with high accuracy. This paper investigates a possible route for addressing this problem. We propose to extend the standard Transformer architecture with an additional memory bank holding extra information (such as facts drawn from a knowledge base), and an extra attention layer for addressing this memory. We add this augmented memory to a Generative Adversarial Network-inspired Transformer architecture. This setup allows for implementing arbitrary felicity conditions on the generated language of the Transformer. We first demonstrate how this machinery can be deployed for handling factual questions in goal-oriented dialogues. Secondly, we demonstrate that our approach can be useful for applications like style adaptation as well: the adaptation of utterances according to certain stylistic (external) constraints, like social properties of human interlocutors in dialogues.
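
The abstract does not spell out how the extra attention layer addresses the memory bank. As a rough illustration only, the sketch below assumes standard scaled dot-product attention, with a single query vector attending over a small bank of key/value pairs. The function names, the toy vectors, and the use of one query are all assumptions made for this example and are not taken from the paper.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def memory_attention(query, memory_keys, memory_values):
    """Scaled dot-product attention of one query over a memory bank.

    query:         list of d floats (hypothetical decoder state)
    memory_keys:   list of n key vectors, each with d floats
    memory_values: list of n value vectors (e.g. embedded facts)
    Returns (attended_vector, attention_weights).
    """
    d = len(query)
    # One similarity score per memory slot, scaled by sqrt(d).
    scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(d)
              for key in memory_keys]
    weights = softmax(scores)
    # Weighted sum of the value vectors, computed per output dimension.
    dim_v = len(memory_values[0])
    attended = [sum(w * v[j] for w, v in zip(weights, memory_values))
                for j in range(dim_v)]
    return attended, weights

# Toy memory bank with three slots (illustrative values only).
# The query points in the same direction as the second key.
keys = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
values = [[10.0, 0.0], [0.0, 10.0], [5.0, 5.0]]
attended, weights = memory_attention([0.0, 4.0], keys, values)
```

In this toy setup the second memory slot receives the largest attention weight, so the attended vector is pulled toward that slot's value. In a full model the output would presumably be combined with the regular self-attention and encoder-decoder attention outputs, but how the paper does this is not stated in the abstract.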

Paper Structure (11 sections, 11 equations, 4 figures, 10 tables)

Introduction
Generative Adversarial Transformers
Memory-augmented Conditional Generative Adversarial Transformers
Related work
Experiments
Data
Experimental conditions
Results
Discussion
Future work
Conclusion

Figures (4)

Figure 1: The memory-augmented Transformer with its encoder (left block) and decoder (right block).
Figure 2: The conditional Generative Adversarial Transformer (GAT), built up from two memory-augmented Transformers. The generator is equipped with additional loss functions conditioning its output. Notice how the generator receives summed losses from the additional loss functions ("Conditions") and the discriminator ("D-loss").
Figure 3: Setup for factual question answering. Three separate memory-augmented Transformers, each trained with their own conditional GAT, address different stages in the process.
Figure 4: Sample Elastic query.
