Bilinear relational structure fixes reversal curse and enables consistent model editing

Dong-Kyum Kim, Minsung Kim, Jea Kwon, Nakyeong Yang, Meeyoung Cha

TL;DR

The paper reframes the reversal curse as a consequence of internal relational geometry rather than a fundamental limitation, arguing that a bilinear relational structure with relation-specific matrices $M_r$ can enable symmetric inference and consistent model editing. By training decoder-only transformers from scratch on a carefully designed synthetic relational knowledge graph, the authors show that regularization such as weight decay can induce a robust bilinear representation, which is validated by targeted probes and algebraic tests for inversion and composition. Crucially, they demonstrate a strong link between the presence of this bilinear structure and the model's ability to propagate edits to entailed facts, suggesting that editing success depends on representational geometry as much as on editing algorithms. The findings imply that preparing models with structured knowledge representations could yield more reliable, logically consistent LMs, while also highlighting safety considerations due to potential cascading generalizations.

Abstract

The reversal curse -- a language model's (LM) inability to infer an unseen fact "B is A" from a learned fact "A is B" -- is widely considered a fundamental limitation. We show that this is not an inherent failure but an artifact of how models encode knowledge. By training LMs from scratch on a synthetic dataset of relational knowledge graphs, we demonstrate that bilinear relational structure emerges in their hidden representations. This structure substantially alleviates the reversal curse, enabling LMs to infer unseen reverse facts. Crucially, we also find that this bilinear structure plays a key role in consistent model editing. When a fact is updated in an LM with this structure, the edit correctly propagates to its reverse and other logically dependent facts. In contrast, models lacking this representation not only suffer from the reversal curse but also fail to generalize edits, further introducing logical inconsistencies. Our results establish that training on a relational knowledge dataset induces the emergence of bilinear internal representations, which in turn enable LMs to behave in a logically consistent manner after editing. This implies that the success of model editing depends critically not just on editing algorithms but on the underlying representational geometry of the knowledge being modified.

Paper Structure

This paper contains 41 sections, 12 equations, 11 figures.

Figures (11)

  • Figure 1: Schematics of the three relational embedding structures examined in our study. Given a subject $s$ and an object $o$, the relation $r$ can be represented as (a) a linear transformation, (b) a vector translation, or (c) a bilinear interaction mediated by a relation-specific matrix $M_r$. In the fact "son of Tracy Lance Smith is Cory Lance Smith," the subject $s$ is "Tracy Lance Smith," the object $o$ is "Cory Lance Smith," and the relation $r$ is "son."
  • Figure 2: (Left) A schematic of the synthetic family knowledge graph used for experiments. Nodes represent entities, and edges represent one of eight relations. (Right) Test accuracy on the unseen relations (mother/father) as a function of weight decay. Each weight decay setting was trained using three different random seeds.
  • Figure 3: Layer-wise averaged accuracy of relational embedding probes across all relations for "Reversal Cursed" models (blue) and "Not Reversal Cursed" models (orange).
  • Figure 4: Performance on relational algebra tasks. Top Row (Composition): Accuracy of inferring a composed relation using the product of the corresponding bilinear matrices (e.g., $M_{\text{husband}} \cdot M_{\text{mother}}$ to probe for 'father'). Bottom Row (Transpose): Accuracy of inferring an inverse relation using the transpose of a bilinear matrix (e.g., $M_{\text{husband}}^\top$ to probe for 'wife').
  • Figure 5: Model editing generalization and its link to bilinear structure. (a) Schematic of the editing task. The fact (A, husband, B) is edited to (A, husband, B'). A successful logical generalization updates the inverse (B', wife, A) and neighborhood relations (C/D, father, B'); (B', daughter/son, C/D). (b) Performance after editing the target layer: Edit Success (direct change), Logical Generalization (propagation to entailed facts), and Locality (impact on unrelated facts). (c) A strong correlation ($R^2=0.939$) exists between a model's best bilinear accuracy and its best logical generalization after editing. (d) Layer-wise performance of bilinear probing and logical generalization for "Not Reversal Cursed" models.
  • ...and 6 more figures
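
The transpose and composition tests described in the Figure 4 caption can be illustrated with a toy sketch. This is not the authors' probing code. It assumes one-hot entity vectors, 0/1 relation matrices, and a score convention $o^\top M_r s$ chosen here so that the product order matches the caption's $M_{\text{husband}} \cdot M_{\text{mother}}$ example; the paper's own convention may differ. The entity names A, B, C and all helper functions are hypothetical.

```python
# Toy illustration only: one-hot entities and 0/1 relation matrices.
# Assumed convention (not taken from the paper): score(s, r, o) = o^T M_r s.

ENTITIES = ["A", "B", "C"]  # hypothetical: A = wife/mother, B = husband/father, C = child
IDX = {name: i for i, name in enumerate(ENTITIES)}
N = len(ENTITIES)


def one_hot(name):
    """Return the one-hot vector for an entity name."""
    return [1.0 if i == IDX[name] else 0.0 for i in range(N)]


def relation_matrix(facts):
    """Build M_r with M_r[o][s] = 1 for every (subject, object) pair of relation r."""
    m = [[0.0] * N for _ in range(N)]
    for s, o in facts:
        m[IDX[o]][IDX[s]] = 1.0
    return m


def transpose(m):
    return [list(row) for row in zip(*m)]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(N)) for j in range(N)]
            for i in range(N)]


def score(s, m, o):
    """Bilinear score o^T M s for subject s, relation matrix m, object o."""
    sv, ov = one_hot(s), one_hot(o)
    ms = [sum(m[i][j] * sv[j] for j in range(N)) for i in range(N)]
    return sum(ov[i] * ms[i] for i in range(N))


def predict(s, m):
    """Return the candidate object with the highest bilinear score."""
    return max(ENTITIES, key=lambda o: score(s, m, o))


M_husband = relation_matrix([("A", "B")])  # "husband of A is B"
M_mother = relation_matrix([("C", "A")])   # "mother of C is A"

M_wife = transpose(M_husband)              # inversion via transpose
M_father = matmul(M_husband, M_mother)     # composition via matrix product

print(predict("B", M_wife))    # A  ("wife of B is A", never stored directly)
print(predict("C", M_father))  # B  ("father of C is B", never stored directly)
```

In this toy setting, neither the wife fact nor the father fact is stored; both follow from the husband and mother matrices alone. It also hints at why edits could propagate under such a structure: changing the single nonzero entry of `M_husband` changes `M_wife` and `M_father` with it, since both are derived from the same matrix.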