$R^2$-CoD: Understanding Text-Graph Complementarity in Relational Reasoning via Knowledge Co-Distillation

Zhen Wu, Ritam Dutt, Luke M. Breitfeller, Armineh Nourbakhsh, Siddharth Parekh, Carolyn Rosé

TL;DR

This work analyzes how text and graph representations complement each other in relational reasoning. It introduces a unified analysis framework ($R^2$-CoD) that uses knowledge co-distillation (CoD) to jointly train text and graph encoders. Across five diverse tasks, the authors show that CoD generally improves performance and reveal a spectrum of interaction patterns, from complete alignment to persistent complementarity, depending on task structure, graph-text correspondence, and the level of reasoning. They provide a methodological toolkit (PCA visualizations and distance/cosine metrics) to characterize when modalities align or diverge, offering practical guidance for designing hybrid text-graph models in structured NLP tasks such as ETRE, MLRE, FU, KBQA, and RPP. The findings advance understanding of multimodal representation learning and inform when to leverage joint text-graph signals to boost relational reasoning systems.

Abstract

Relational reasoning lies at the core of many NLP tasks, drawing on complementary signals from text and graphs. While prior research has investigated how to leverage this dual complementarity, a detailed and systematic understanding of text-graph interplay and its effect on hybrid models remains underexplored. We take an analysis-driven approach to investigate text-graph representation complementarity via a unified architecture that supports knowledge co-distillation (CoD). We explore five tasks involving relational reasoning that differ in how text and graph structures encode the information needed to solve that task. By tracking how these dual representations evolve during training, we uncover interpretable patterns of alignment and divergence, and provide insights into when and why their integration is beneficial.

Paper Structure

This paper contains 44 sections, 7 equations, 17 figures, 9 tables.

Figures (17)

  • Figure 1: Task spectrum of representation relationships. Left: they remain distinct and complementary. Middle: they show some similarity but do not fully align. Right: they converge toward aligned representations. This spectrum motivates our task selection for analysis.
  • Figure 2: Our unified framework for analyzing how text and graph representations complement each other. A text sequence and its corresponding graph are processed by separate encoders. Their outputs are used in two ways: (1) combined as hybrid inputs for task prediction, and (2) projected into a shared space where a contrastive co-distillation (CoD) objective encourages mutual learning and enables representation-level analysis.
  • Figure 3: Results for ETRE on the TDDMan dataset. PCA visualizations (top) across training stages, and corresponding distance-based metrics (bottom). The text and graph representations remain well-separated, and the between-group distance remains consistently higher than the within-group distances.
  • Figure 4: Results for reasoning pattern prediction on the WebQSP dataset. The text and graph representations move closer but stay largely separable. The between-group distance increases during training.
  • Figure 5: Results for form understanding on the CORD dataset. The text and graph representations draw closer and form overlapping clusters during training, and the between-group distance decreases and eventually approaches the within-group distances.
  • ...and 12 more figures
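
The captions for Figures 3 to 5 compare a between-group distance (text vs. graph representations) against within-group distances, alongside PCA projections. The following is a minimal sketch of how such quantities could be computed, not the authors' implementation. The choice of mean pairwise Euclidean distance, the function names (`mean_pairwise_distance`, `group_distances`, `pca_2d`), and the synthetic Gaussian embeddings are all assumptions made for illustration; the paper's exact metric definitions are not given in this section.

```python
import numpy as np


def mean_pairwise_distance(a, b, exclude_self=False):
    """Mean Euclidean distance between every row of `a` and every row of `b`.

    With exclude_self=True (only meaningful when a and b are the same set),
    the zero distances on the diagonal are left out of the mean.
    """
    diffs = a[:, None, :] - b[None, :, :]
    dists = np.linalg.norm(diffs, axis=-1)
    if exclude_self:
        mask = ~np.eye(a.shape[0], dtype=bool)
        return float(dists[mask].mean())
    return float(dists.mean())


def group_distances(text_emb, graph_emb):
    """Within-group and between-group distances (assumed definitions)."""
    return {
        "within_text": mean_pairwise_distance(text_emb, text_emb, exclude_self=True),
        "within_graph": mean_pairwise_distance(graph_emb, graph_emb, exclude_self=True),
        "between": mean_pairwise_distance(text_emb, graph_emb),
    }


def pca_2d(x):
    """Project the rows of `x` onto their first two principal components."""
    centered = x - x.mean(axis=0, keepdims=True)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:2].T


# Synthetic stand-ins for projected text and graph embeddings (hypothetical data).
rng = np.random.default_rng(0)
text_emb = rng.normal(loc=0.0, scale=1.0, size=(50, 16))
graph_far = rng.normal(loc=5.0, scale=1.0, size=(50, 16))   # well-separated case
graph_near = rng.normal(loc=0.0, scale=1.0, size=(50, 16))  # overlapping case

separated = group_distances(text_emb, graph_far)
overlapping = group_distances(text_emb, graph_near)

# 2-D coordinates for a scatter plot of both modalities in one shared projection.
coords = pca_2d(np.vstack([text_emb, graph_far]))
```

Under these assumptions, a between-group distance that stays well above both within-group distances corresponds to the separated pattern described for Figure 3, while a between-group distance close to the within-group values corresponds to the overlapping pattern described for Figure 5. Computing the same dictionary at several training checkpoints would give curves comparable in spirit to the bottom panels of those figures.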