Knowledge Graphs as Structured Memory for Embedding Spaces: From Training Clusters to Explainable Inference

Artur A. Oliveira, Mateus Espadoto, Roberto M. Cesar, Roberto Hirata

TL;DR

Graph Memory (GM) addresses the lack of relational structure in embedding-based inference by building a reusable prototype-graph memory where each node summarizes a region of the embedding space and carries reliability indicators, while edges encode geometric and contextual relations. Inference is performed by a diffusion process over this graph, yielding calibrated, region-aware predictions and localized explanations that highlight which prototypes supported a decision. GM unifies non-parametric retrieval, prototype-based interpretability, and graph-based semi-supervised reasoning, and can be attached to any fixed encoder in an inductive fashion. Empirically, GM achieves accuracy competitive with $k$NN and Label Spreading while delivering markedly better calibration and smoother decision boundaries, using an order of magnitude fewer samples, and it demonstrates strong performance on both synthetic data and breast histopathology (IDC). This framework provides a principled bridge between local evidence and global consistency, enabling scalable, explainable non-parametric inference with compact prototype memory.

Abstract

We introduce Graph Memory (GM), a structured non-parametric framework that augments embedding-based inference with a compact, relational memory over region-level prototypes. Rather than treating each training instance in isolation, GM summarizes the embedding space into prototype nodes annotated with reliability indicators and connected by edges that encode geometric and contextual relations. This design unifies instance retrieval, prototype-based reasoning, and graph-based label propagation within a single inductive model that supports both efficient inference and faithful explanation. Experiments on synthetic and real datasets including breast histopathology (IDC) show that GM achieves accuracy competitive with $k$NN and Label Spreading while offering substantially better calibration and smoother decision boundaries, all with an order of magnitude fewer samples. By explicitly modeling reliability and relational structure, GM provides a principled bridge between local evidence and global consistency in non-parametric learning.
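The abstract describes the pipeline only at a high level: prototype nodes with reliability indicators, relational edges, and diffusion-based inference. The following is a minimal sketch of that idea, not the authors' implementation. Everything concrete here is an assumption made for illustration: prototypes come from plain k-means, reliability is approximated by cluster label purity, edges are Gaussian-weighted $k$-nearest-neighbor links between prototypes, and diffusion uses a Label-Spreading-style iteration $f \leftarrow \alpha S f + (1-\alpha) f_0$. The function names `build_graph_memory` and `predict_proba` are hypothetical.

```python
import numpy as np

def build_graph_memory(X, y, n_prototypes=8, n_classes=2, k=3, n_iter=20, seed=0):
    """Summarize (X, y) into prototypes, label distributions, reliability, and edges."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), n_prototypes, replace=False)].astype(float)
    for _ in range(n_iter):  # plain Lloyd k-means (assumed prototype builder)
        assign = ((X[:, None] - centers[None]) ** 2).sum(-1).argmin(1)
        for j in range(n_prototypes):
            if np.any(assign == j):
                centers[j] = X[assign == j].mean(0)
    assign = ((X[:, None] - centers[None]) ** 2).sum(-1).argmin(1)

    # Per-prototype class distribution; empty clusters stay uniform.
    label_dist = np.full((n_prototypes, n_classes), 1.0 / n_classes)
    for j in range(n_prototypes):
        members = y[assign == j]
        if len(members):
            label_dist[j] = np.bincount(members, minlength=n_classes) / len(members)
    reliability = label_dist.max(1)  # label purity as a stand-in reliability score

    # Symmetric kNN graph over prototypes with Gaussian edge weights.
    d2 = ((centers[:, None] - centers[None]) ** 2).sum(-1)
    np.fill_diagonal(d2, np.inf)      # exclude self-loops
    sigma2 = np.median(d2.min(1))     # bandwidth from nearest-neighbor distances
    W = np.zeros_like(d2)
    for i in range(n_prototypes):
        nbrs = np.argsort(d2[i])[:k]
        W[i, nbrs] = np.exp(-d2[i, nbrs] / sigma2)
    W = np.maximum(W, W.T)
    return centers, label_dist, reliability, W, sigma2

def predict_proba(x, memory, alpha=0.8, n_steps=20):
    """Diffuse a reliability-weighted query seed over the prototype graph."""
    centers, label_dist, reliability, W, sigma2 = memory
    deg = W.sum(1) + 1e-12
    S = W / np.sqrt(np.outer(deg, deg))  # normalized adjacency D^{-1/2} W D^{-1/2}

    d2q = ((centers - x) ** 2).sum(1)
    # Subtracting the minimum keeps the exponent stable for far-away queries.
    seed = np.exp(-(d2q - d2q.min()) / sigma2) * reliability
    f0 = seed / seed.sum()

    f = f0.copy()
    for _ in range(n_steps):
        f = alpha * (S @ f) + (1 - alpha) * f0
    support = f / f.sum()                # per-prototype contribution to the decision
    return support @ label_dist, support

# Usage on two well-separated synthetic blobs.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal([0, 0], 0.5, (100, 2)), rng.normal([3, 3], 0.5, (100, 2))])
y = np.repeat([0, 1], 100)
memory = build_graph_memory(X, y)
probs, support = predict_proba(np.array([0.0, 0.0]), memory)
```

In this sketch the returned `support` vector plays the role of the localized explanation: its largest entries identify the prototypes that contributed most to the prediction. Inference touches only the prototypes, not the full training set, which is the sense in which the memory is compact.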

Paper Structure

This paper contains 27 sections, 17 equations, 4 figures, 4 tables.

Figures (4)

  • Figure 1: Decision regions on synthetic data. GM yields flatter, reliability-weighted boundaries while retaining prototype-level interpretability.
  • Figure 2: Dirichlet energy maps (visualized as gradient-magnitude fields, $\|\nabla p(y{=}1\mid x)\|$). Lower values indicate smoother and more stable decision transitions. GM produces lower-energy, reliability-weighted boundaries and avoids the sample-hugging artifacts observed in Label Spreading and $k$NN, particularly near class interfaces.
  • Figure 3: Representative benign tissue patches from the IDC dataset.
  • Figure 4: Representative malignant tissue patches from the IDC dataset.
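The gradient-magnitude field mentioned in the Figure 2 caption, $\|\nabla p(y{=}1\mid x)\|$, can be approximated on a 2D grid with finite differences. The sketch below is an illustration under assumptions rather than the paper's plotting code: the helper name `gradient_magnitude`, the grid limits, and the logistic test surfaces are all hypothetical.

```python
import numpy as np

def gradient_magnitude(prob_fn, xlim=(-1.0, 1.0), ylim=(-1.0, 1.0), n=51):
    """Finite-difference estimate of ||grad p(y=1|x)|| on an n-by-n grid."""
    xs = np.linspace(*xlim, n)
    ys = np.linspace(*ylim, n)
    gx, gy = np.meshgrid(xs, ys)           # rows index y, columns index x
    P = prob_fn(gx, gy)                    # p(y=1 | x) evaluated on the grid
    dP_dy, dP_dx = np.gradient(P, ys, xs)  # derivatives along axis 0 (y) and axis 1 (x)
    return np.sqrt(dP_dx ** 2 + dP_dy ** 2)

# Two toy decision surfaces: a gentle and a sharp logistic transition along x.
smooth = gradient_magnitude(lambda x, y: 1.0 / (1.0 + np.exp(-4.0 * x)))
sharp = gradient_magnitude(lambda x, y: 1.0 / (1.0 + np.exp(-8.0 * x)))

# Mean squared gradient magnitude as a discrete stand-in for Dirichlet energy.
energy_smooth = (smooth ** 2).mean()
energy_sharp = (sharp ** 2).mean()
```

Under this definition the sharper transition has the higher energy, which matches the caption's reading that lower values correspond to smoother decision transitions.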