Graphy'our Data: Towards End-to-End Modeling, Exploring and Generating Report from Raw Data
Longbin Lai, Changwei Luo, Yunkai Lou, Mingchen Ju, Zhengyi Yang
TL;DR
Graphy tackles Progressive Document Investigation with two components: an offline Scrapper that builds a Fact-Dimension graph from unstructured documents, and an online Surveyor that supports iterative graph exploration and LLM-driven report generation. It uses a property graph model in which Fact nodes represent papers and Dimension nodes capture extracted content such as abstracts, challenges, and solutions, supporting scalable, structured literature surveys. The approach is demonstrated on a large pre-scrapped graph (tens of thousands of papers, hundreds of thousands of dimensions, and numerous references) and covers an end-to-end workflow from exploration to mind-map generation and formatted report output (PDF/TeX), with open-source code and data. The framework may generalize to finance and other domains through its navigation-driven linking and generation capabilities, enabling reproducible, user-overseen, multi-step investigations in complex corpora.
Abstract
Large Language Models (LLMs) have recently demonstrated remarkable performance in tasks such as Retrieval-Augmented Generation (RAG) and autonomous AI agent workflows. Yet, when faced with large sets of unstructured documents requiring progressive exploration, analysis, and synthesis, such as conducting a literature survey, existing approaches often fall short. We address this challenge -- termed Progressive Document Investigation -- by introducing Graphy, an end-to-end platform that automates data modeling, exploration, and high-quality report generation in a user-friendly manner. Graphy comprises an offline Scrapper that transforms raw documents into a structured graph of Fact and Dimension nodes, and an online Surveyor that enables iterative exploration and LLM-driven report generation. We showcase a pre-scrapped graph of over 50,000 papers -- complete with their references -- demonstrating how Graphy facilitates the literature-survey scenario. The demonstration video can be found at https://youtu.be/uM4nzkAdGlM.
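To make the Fact-Dimension model more concrete, the following is a minimal Python sketch of a property graph in which papers are Fact nodes, extracted content is stored in Dimension nodes, and references link papers to each other. It is an illustration under stated assumptions, not Graphy's actual implementation: the class and function names (`PropertyGraph`, `build_example_graph`, `expand`), the edge labels (`has_dimension`, `cites`), and the sample data are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    node_id: str
    label: str                      # "Fact" (a paper) or "Dimension" (extracted content)
    props: dict = field(default_factory=dict)


class PropertyGraph:
    """Tiny in-memory property graph (hypothetical, for illustration only)."""

    def __init__(self):
        self.nodes = {}             # node_id -> Node
        self.edges = []             # (src_id, edge_label, dst_id)

    def add_node(self, node_id, label, **props):
        self.nodes[node_id] = Node(node_id, label, props)

    def add_edge(self, src, edge_label, dst):
        if src not in self.nodes or dst not in self.nodes:
            raise KeyError("both endpoints must exist before adding an edge")
        self.edges.append((src, edge_label, dst))

    def neighbors(self, node_id, edge_label=None):
        """Return nodes reachable from node_id, optionally filtered by edge label."""
        return [
            self.nodes[dst]
            for src, lbl, dst in self.edges
            if src == node_id and (edge_label is None or lbl == edge_label)
        ]


def build_example_graph():
    """Build a two-paper graph with made-up placeholder content."""
    g = PropertyGraph()
    g.add_node("paper:1", "Fact", title="Placeholder Paper A")
    g.add_node("paper:2", "Fact", title="Placeholder Paper B")
    # Dimension nodes hold content extracted from a paper.
    g.add_node("dim:1:challenge", "Dimension", kind="challenge",
               text="placeholder challenge text")
    g.add_node("dim:1:solution", "Dimension", kind="solution",
               text="placeholder solution text")
    g.add_edge("paper:1", "has_dimension", "dim:1:challenge")
    g.add_edge("paper:1", "has_dimension", "dim:1:solution")
    # A reference links one Fact node to another.
    g.add_edge("paper:1", "cites", "paper:2")
    return g


def expand(graph, selected_ids, edge_label):
    """One exploration step: follow edge_label from every selected node."""
    found = set()
    for node_id in selected_ids:
        for node in graph.neighbors(node_id, edge_label):
            found.add(node.node_id)
    return sorted(found)
```

In this sketch, one step of iterative exploration is a call such as `expand(g, ["paper:1"], "cites")`, which returns the papers cited by the current selection; the Dimension nodes of the selected papers are what would then be handed to an LLM for summarization.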
