Implementing Derivations of Definite Logic Programs with Self-Attention Networks

Phan Thi Thanh Thuy, Akihiro Yamamoto

TL;DR

The paper shows that hierarchical constructions of self-attention networks with feed-forward networks (FFNs) can implement top-down derivations for a class of logical formulae; the authors take this as evidence that LLMs implicitly have the power of logical inference.

Abstract

In this paper we propose that a restricted version of logical inference can be implemented with self-attention networks. We aim to show that LLMs (Large Language Models) constructed with transformer networks can make logical inferences. We reveal the potential of LLMs by analyzing self-attention networks, which are the main components of transformer networks. Our approach is based not on the semantics of natural languages but on the operations of logical inference. We show that hierarchical constructions of self-attention networks with feed-forward networks (FFNs) can implement top-down derivations for a class of logical formulae. We also show that bottom-up derivations can be implemented for the same class. We believe that our results show that LLMs implicitly have the power of logical inference.

Paper Structure

This paper contains 7 sections, 1 theorem, 40 equations, 1 figure.

Key Result

Proposition 1

Assume that the self-attention function employs the hardmax function and is followed by the dimension-wise Heaviside function $H$. Let $P$ be an SD-program and let $\bm{q}$ be a vector representing a query $Q$. Then the output $H(\hbox{\em Attention}(\bm{q}, H_P, B_P))$ represents the query $Q'$ obtaine
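
To make the shape of the expression $H(\hbox{\em Attention}(\bm{q}, H_P, B_P))$ concrete, the following is a minimal sketch under assumptions that are not stated in the text above: atoms are propositional, a query is a 0/1 vector over atoms, each row of $H_P$ is the one-hot head of a clause, each row of $B_P$ is the multi-hot body of the same clause, and hardmax spreads weight uniformly over tied maxima. The toy program and the helper names (`hardmax`, `heaviside`, `attention`, `step`, `derive`) are hypothetical and are not the paper's construction.

```python
# Toy propositional encoding (an assumption, not taken from the paper):
# atoms a, b, c, d are indexed 0..3 and a query is a 0/1 vector over atoms.
# Hypothetical program:   a :- b, c.    b :- d.    c.    d.
H_P = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]  # row i: head of clause i
B_P = [[0, 1, 1, 0], [0, 0, 0, 1], [0, 0, 0, 0], [0, 0, 0, 0]]  # row i: body of clause i


def hardmax(scores):
    """Uniform weight on the maximal entries, zero elsewhere (assumed tie rule)."""
    top = max(scores)
    hits = [1.0 if s == top else 0.0 for s in scores]
    total = sum(hits)
    return [h / total for h in hits]


def heaviside(xs):
    """Dimension-wise step function: 1 where the entry is positive, else 0."""
    return [1 if x > 0 else 0 for x in xs]


def attention(q, keys, values):
    """Single-query attention with hardmax in place of softmax."""
    scores = [sum(qi * ki for qi, ki in zip(q, key)) for key in keys]
    weights = hardmax(scores)
    dim = len(values[0])
    return [sum(w * row[j] for w, row in zip(weights, values)) for j in range(dim)]


def step(q):
    """One application of H(Attention(q, H_P, B_P)) on the toy encoding."""
    return heaviside(attention(q, H_P, B_P))


def derive(q, max_steps=10):
    """Repeat `step` until the query vector is all zeros (the empty query)."""
    trace = [q]
    while any(q) and len(trace) <= max_steps:
        q = step(q)
        trace.append(q)
    return trace
```

On this toy program, `derive([1, 0, 0, 0])` starts from the query `a` and passes through the vectors for `{b, c}` and `{d}` before reaching the all-zero vector. In the second step the two matching clauses tie, the averaged output has a fractional entry, and the Heaviside function restores a 0/1 vector. The sketch does not handle atoms that match no clause head.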

Figures (1)

  • Figure 1: A successful top-down derivation

Theorems & Definitions (2)

  • Proposition 1
  • proof