SuperRAG: Beyond RAG with Layout-Aware Graph Modeling

Jeff Yang, Duy-Khanh Vu, Minh-Tien Nguyen, Xuan-Quang Nguyen, Linh Nguyen, Hung Le

TL;DR

This work addresses the challenge of multimodal document understanding within Retrieval Augmented Generation by introducing Layout-Aware Graph Modeling (LAGM), a graph-based representation that preserves document layout and the relationships among text, tables, and diagrams. The SuperRAG framework combines LAGM with flexible retrieval strategies (LLM-driven graph traversal and heuristic TOC/table/diagram reasoning) and graph augmentation to enable accurate, scalable multimodal QA. Through DOCBENCH and SPIQA evaluations, SuperRAG demonstrates significant improvements over non-layout RAG and strong baselines, validating the value of layout-aware structure for IR and reasoning. The proposed system is practical for business use, offering a modular, demo-enabled pipeline with robust parsing, data modeling, IR, and prompt design that can be integrated into existing RAG workflows.

Abstract

This paper introduces layout-aware graph modeling for multimodal RAG. Unlike traditional RAG methods, which mostly operate on flat text chunks, the proposed method captures the relationships among modalities by using a graph structure. The graph is defined on top of document layout parsing, so the structure of an input document is retained through connections among text chunks, tables, and figures. This representation allows the method to handle complex questions that require information from several modalities. To confirm the effectiveness of the graph modeling, a flexible RAG pipeline is developed using robust components. Experimental results on four benchmark test sets confirm that layout-aware modeling improves the performance of the RAG pipeline.
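
As a rough illustration of the idea described above, the following Python sketch stores text chunks, tables, and figures as nodes of one graph and expands a retrieved chunk to its layout neighbors. It is a minimal sketch under stated assumptions, not the paper's implementation: the names `Node`, `LayoutGraph`, `add_edge`, and `expand`, the undirected edges, and the placeholder contents are all hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class Node:
    node_id: str
    kind: str      # assumed kinds: "section", "text", "table", "figure"
    content: str   # raw text, serialized table, or figure caption


@dataclass
class LayoutGraph:
    """Hypothetical container; the paper's actual schema may differ."""
    nodes: dict = field(default_factory=dict)
    edges: dict = field(default_factory=lambda: defaultdict(set))

    def add_node(self, node_id, kind, content):
        self.nodes[node_id] = Node(node_id, kind, content)

    def add_edge(self, src, dst):
        # Undirected link, e.g. section-contains-chunk or chunk-refers-to-table.
        self.edges[src].add(dst)
        self.edges[dst].add(src)

    def expand(self, seed_id, hops=1):
        """Return the seed plus every node reachable within `hops` edges."""
        seen = {seed_id}
        frontier = {seed_id}
        for _ in range(hops):
            frontier = {n for f in frontier for n in self.edges[f]} - seen
            seen |= frontier
        return sorted(seen)


# Placeholder document: one section holding a text chunk and a figure,
# where the text chunk refers to a table.
graph = LayoutGraph()
graph.add_node("sec1", "section", "placeholder section heading")
graph.add_node("txt1", "text", "placeholder paragraph that mentions a table")
graph.add_node("tab1", "table", "placeholder serialized table")
graph.add_node("fig1", "figure", "placeholder figure caption")
graph.add_edge("sec1", "txt1")
graph.add_edge("sec1", "fig1")
graph.add_edge("txt1", "tab1")

one_hop = graph.expand("txt1", hops=1)
two_hop = graph.expand("txt1", hops=2)
```

In this toy setup, a flat-chunk retriever that matched only `txt1` would miss the table it refers to, whereas a one-hop expansion also returns `tab1` and the enclosing section. How many hops to follow, and whether edges should be typed or directed, are design choices this sketch leaves open.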

Paper Structure

This paper contains 32 sections, 5 figures, 7 tables.

Figures (5)

  • Figure 1: The pipeline of the in-house parser.
  • Figure 2: The knowledge graph used for data modeling.
  • Figure 3: The proposed SuperRAG framework.
  • Figure 4: The proposed SuperRAG framework.
  • Figure 5: The demo system with the sample from the DOCBENCH dataset. The input question is "How many persons were convicted for money laundering offenses in Cyprus in 2018?" and the answer is "26 persons".