Attention's Gravitational Field: A Power-Law Interpretation of Positional Correlation

Edward Zhang

TL;DR

By decoupling positional encodings from semantic embeddings, we optimize the model architecture and introduce the concept of the Attention Gravitational Field (AGF), demonstrating its intrinsic consistency with learning and stability curves.

Abstract

This paper explores the underlying principles of positional relationships and encodings within Large Language Models (LLMs) and introduces the concept of the Attention Gravitational Field (AGF). By decoupling positional encodings from semantic embeddings, we optimize the model architecture and achieve superior accuracy compared to prevailing encoding methods. Furthermore, we provide an in-depth analysis of AGF, demonstrating its intrinsic consistency with learning and stability curves, as well as its empirical alignment with Newton's Law of Universal Gravitation. By offering a rigorous theoretical exploration of these phenomena, this work represents a significant step toward interpreting the Attention mechanism and unlocks new possibilities for future research in model optimization and interpretability.

Paper Structure

This paper contains 17 sections, 19 equations, 8 figures, 6 tables.

Introduction
Methodology and Background
Positional Correlation
Decomposition
Directionality
Component Analysis (LC 1-3)
Attention's Gravitational Field
Benefits of Decoupling
PCM-V: Positional Coefficient Multiplication of $V$
Other Tricks
Why AGF Works!
What is Attention?
Power-Law
Gravitational Field
The Expanding Sphere Model
...and 2 more sections

Figures (8)

Figure 1: Decomposition
Figure 2: Part-of-Speech VS Attention
Figure 3: Frequency Distribution of Words Following 'beautiful'
Figure 4: Power vs Exp
Figure 5: Learning Curve
...and 3 more figures
