Trace and Edit Relation Associations in GPT

Jiahang Li; Taoyu Chen; Yuanli Wang

TL;DR

The paper investigates where entity-relational knowledge resides in GPT-like transformers and how it can be edited. It introduces relation tracing to identify critical early- and late-layer MLP components, and uses causal mediation analysis, augmented with counterfactual data, to understand how relations are stored and recalled, benchmarking against ROME on FewRel. A modified ROME approach applies a rank-one update to the fifth MLP layer, yielding improved generalization and specificity in relation edits, with paraphrase success rising from 40.71% to 41.07%. The work demonstrates that precise, layer-targeted edits of relational knowledge are feasible, which bears on controlled knowledge manipulation in language models and points to future architectural and methodological refinements.

Abstract

This study introduces a novel approach for analyzing and modifying entity relationships in GPT models, diverging from ROME's entity-focused methods. We develop a relation tracing technique to understand the influence of language model computations on relationship judgments. Using the FewRel dataset, we identify key roles of MLP modules and attention mechanisms in processing relationship information. Our method, tested against ROME on a new dataset, shows improved balance in specificity and generalization, underscoring the potential of manipulating early-layer modules for enhanced model understanding and accuracy.

Paper Structure (14 sections, 4 figures, 3 tables)

Introduction
Related work
Problem formulation
Methods
Relation Tracing
Editing Relation in GPT
Dataset
Relation Extraction Migration
Dataset for evaluation
Results
Confirming the Importance of Decisive States Identified by Relation Tracing
Comparing Generation Results
Conclusion
Future Work

Figures (4)

Figure 1: Data item for evaluating modifications to different MLP layers
Figure 2: The relation impact on output probability
Figure 3: Relation distribution in the dataset for evaluation
Figure 4: Performance and variance after modifying the corresponding MLP layer in GPT
