TOBUGraph: Knowledge Graph-Based Retrieval for Enhanced LLM Performance Beyond RAG

Savini Kashmira, Jayanaka L. Dantanarayana, Joshua Brodsky, Ashish Mahendra, Yiping Kang, Krisztian Flautner, Lingjia Tang, Jason Mars

TL;DR

TOBUGraph addresses key failures of Retrieval-Augmented Generation by automatically constructing a knowledge graph from unstructured data and performing retrieval via graph traversal rather than text-to-text embedding similarity. The framework builds a Relational Memory Graph that encodes per-memory semantic nodes and cross-memory relationship nodes, enabling deep semantic connections and robust retrieval without heavy chunking configurations. Evaluation on real TOBU data shows TOBUGraph achieving superior precision, recall, and F1 scores, alongside strong user preference and reduced hallucinations. The work demonstrates practical impact for personal memory organization apps and outlines future work on scaling the graph to large memory collections.

Abstract

Retrieval-Augmented Generation (RAG) is one of the leading and most widely used techniques for enhancing LLM retrieval capabilities, but it still faces significant limitations in commercial use cases. RAG primarily relies on the query-chunk text-to-text similarity in the embedding space for retrieval and can fail to capture deeper semantic relationships across chunks, is highly sensitive to chunking strategies, and is prone to hallucinations. To address these challenges, we propose TOBUGraph, a graph-based retrieval framework that first constructs the knowledge graph from unstructured data dynamically and automatically. Using LLMs, TOBUGraph extracts structured knowledge and diverse relationships among data, going beyond RAG's text-to-text similarity. Retrieval is achieved through graph traversal, leveraging the extracted relationships and structures to enhance retrieval accuracy, eliminating the need for chunking configurations while reducing hallucination. We demonstrate TOBUGraph's effectiveness in TOBU, a real-world application in production for personal memory organization and retrieval. Our evaluation using real user data demonstrates that TOBUGraph outperforms multiple RAG implementations in both precision and recall, significantly improving user experience through improved retrieval accuracy.
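
To make the idea of retrieval via graph traversal concrete, the following is a minimal sketch, not the authors' implementation. It assumes a toy graph with three hypothetical node kinds ("memory", "semantic", "relation") and uses a plain breadth-first search with a hop limit. The class name `MemoryGraph`, the node labels, and the `max_hops` parameter are all illustrative assumptions. In the actual framework an LLM extracts the nodes and relationships, whereas here they are written by hand.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class MemoryGraph:
    """Toy undirected graph; node kinds are hypothetical labels, not from the paper."""
    edges: dict = field(default_factory=dict)   # node id -> set of neighbor ids
    kinds: dict = field(default_factory=dict)   # node id -> "memory" | "semantic" | "relation"

    def add_node(self, node, kind):
        self.kinds[node] = kind
        self.edges.setdefault(node, set())

    def link(self, a, b):
        # Undirected edge between two existing nodes.
        self.edges[a].add(b)
        self.edges[b].add(a)

    def retrieve(self, start_nodes, max_hops=2):
        """Breadth-first search from the start nodes.

        Returns the sorted ids of all memory nodes reachable within max_hops edges.
        """
        seen = set(start_nodes)
        queue = deque((n, 0) for n in seen)
        found = set()
        while queue:
            node, depth = queue.popleft()
            if self.kinds[node] == "memory":
                found.add(node)
            if depth == max_hops:
                continue  # do not expand beyond the hop limit
            for nxt in self.edges[node]:
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append((nxt, depth + 1))
        return sorted(found)


def build_demo_graph():
    """Hand-written example data (assumed, purely illustrative)."""
    g = MemoryGraph()
    for m in ("m1", "m2", "m3"):
        g.add_node(m, "memory")
    g.add_node("location:Paris", "semantic")   # per-memory semantic node
    g.add_node("location:Tokyo", "semantic")
    g.add_node("rel:same_trip", "relation")    # cross-memory relationship node
    g.link("m1", "location:Paris")
    g.link("m3", "location:Tokyo")
    g.link("m1", "rel:same_trip")
    g.link("m2", "rel:same_trip")
    return g


if __name__ == "__main__":
    graph = build_demo_graph()
    # m2 never mentions Paris, yet it is reachable through the relationship node.
    print(graph.retrieve(["location:Paris"], max_hops=3))
```

In this sketch, a query that matches the Paris node reaches `m2` only through the relationship node linking it to `m1`. This is the kind of cross-memory connection that similarity over isolated chunks can miss.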

Paper Structure

This paper contains 21 sections, 4 figures, 3 tables.

Figures (4)

  • Figure 1: (a) Memory capturing workflow and (b) memory retrieving workflow in the TOBUGraph framework, as implemented in the TOBU app.
  • Figure 2: Distribution of evaluator preference of each approach, as probabilities. Among 480 human evaluators, TOBUGraph responses are preferred 75% of the time on average, when present as a response option in a pairwise comparison. Furthermore, the preference distribution for TOBUGraph has lower variance, indicating more consistent performance compared to other approaches.
  • Figure 3: Evaluator preferences for each approach, measured as probabilities across four categorization levels based on memory retrieval complexity and nature. TOBUGraph consistently achieves the highest preference among evaluators across all levels, outperforming other approaches regardless of question complexity.
  • Figure 4: Example conversations from the dataset (discussed in the dataset subsection), where (a) exhibits issues I1 and I2, (b) shows hallucination as in I4, and (c) demonstrates issues I2 and I3 from the qualitative comparison table.
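
The preference probabilities described for Figures 2 and 3 can be read as "wins divided by appearances" in pairwise comparisons. The sketch below shows one plausible way to compute such a rate. The function name `preference_rates`, the approach labels, and the toy comparison records are assumptions made for illustration. They are not the paper's data, and the paper may aggregate per evaluator differently.

```python
from collections import defaultdict


def preference_rates(comparisons):
    """Compute, for each approach, the fraction of its pairwise appearances it won.

    comparisons: iterable of (option_a, option_b, winner) tuples, where winner
    must be one of the two options shown.
    """
    wins = defaultdict(int)
    shown = defaultdict(int)
    for a, b, winner in comparisons:
        if winner not in (a, b):
            raise ValueError(f"winner {winner!r} was not one of the options")
        shown[a] += 1
        shown[b] += 1
        wins[winner] += 1
    return {name: wins[name] / shown[name] for name in shown}


if __name__ == "__main__":
    # Toy records with hypothetical approach labels (not real evaluation data).
    toy = [
        ("graph", "rag_a", "graph"),
        ("graph", "rag_b", "graph"),
        ("graph", "rag_a", "rag_a"),
        ("rag_a", "rag_b", "rag_a"),
    ]
    for name, rate in sorted(preference_rates(toy).items()):
        print(f"{name}: {rate:.2f}")
```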