Affordable AI Assistants with Knowledge Graph of Thoughts
Maciej Besta, Lorenzo Paleari, Jia Hao Andrea Jiang, Robert Gerstenberger, You Wu, Jón Gunnar Hannesson, Patrick Iff, Ales Kubicek, Piotr Nyczyk, Diana Khimey, Nils Blach, Haiqiang Zhang, Tao Zhang, Peiran Ma, Grzegorz Kwaśniewski, Marcin Copik, Hubert Niewiadomski, Torsten Hoefler
TL;DR
This work tackles the high cost and limited success rates of LLM-driven agents by introducing Knowledge Graph of Thoughts (KGoT), a modular AI assistant architecture that constructs and evolves task-specific knowledge graphs (KGs) to guide reasoning. KGoT combines a dual-LLM controller, a Graph Store supporting multiple KG representations (property graph, RDF, adjacency list), and a versatile tool suite, enabling iterative knowledge acquisition followed by structured query-based or script-based retrieval. Empirical results on GAIA and SimpleQA show that KGoT achieves higher task success rates while substantially reducing costs (e.g., over 36x cheaper than GPT-4o when a smaller model is used). Externalizing reasoning into a KG also improves transparency, bias mitigation, and robustness, the latter supported by Self-Consistency and LLM-as-a-Judge. The approach scales through asynchronous execution and MPI-based distribution, and lays groundwork for applying KG-based reasoning to diverse, complex domains with external compute workflows.
Abstract
Large Language Models (LLMs) are revolutionizing the development of AI assistants capable of performing diverse tasks across domains. However, current state-of-the-art LLM-driven agents face significant challenges, including high operational costs and limited success rates on complex benchmarks like GAIA. To address these issues, we propose Knowledge Graph of Thoughts (KGoT), an innovative AI assistant architecture that integrates LLM reasoning with dynamically constructed knowledge graphs (KGs). KGoT extracts and structures task-relevant knowledge into a dynamic KG representation, which is iteratively enhanced through external tools such as math solvers, web crawlers, and Python scripts. This structured representation of task-relevant knowledge enables low-cost models to solve complex tasks effectively while also minimizing bias and noise. For example, KGoT achieves a 29% improvement in task success rates on the GAIA benchmark compared to Hugging Face Agents with GPT-4o mini. Moreover, harnessing a smaller model reduces operational costs by over 36x compared to GPT-4o. Improvements for other models (e.g., Qwen2.5-32B and DeepSeek-R1-70B) and benchmarks (e.g., SimpleQA) are similar. KGoT offers a scalable, affordable, versatile, and high-performing solution for AI assistants.
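
The iterative loop described above (check whether the KG already answers the task; if not, call a tool and add the result to the KG) can be illustrated with a minimal sketch. This is not the authors' implementation: the names KnowledgeGraph, toy_lookup_tool, and solve are hypothetical, the KG is a plain set of triples rather than a real graph store, and the LLM controller is replaced by a deterministic rule so that the example runs on its own.

```python
from dataclasses import dataclass, field

# A fact is a (subject, predicate, object) triple, as in an RDF-style KG.
Triple = tuple[str, str, str]


@dataclass
class KnowledgeGraph:
    """Hypothetical in-memory stand-in for a graph store."""
    triples: set[Triple] = field(default_factory=set)

    def add(self, new_triples):
        self.triples.update(new_triples)

    def query(self, subject=None, predicate=None):
        # Return all triples matching the given subject and/or predicate.
        return sorted(
            t for t in self.triples
            if (subject is None or t[0] == subject)
            and (predicate is None or t[1] == predicate)
        )


def toy_lookup_tool(entity):
    """Hypothetical external tool (stands in for a web crawler or solver)."""
    facts = {
        "Zurich": [("Zurich", "located_in", "Switzerland")],
        "Switzerland": [("Switzerland", "capital", "Bern")],
    }
    return facts.get(entity, [])


def solve(start_entity, relations, max_iterations=5):
    """Follow a chain of relations, enriching the KG only when it lacks a fact.

    Assumption: a real controller would ask an LLM whether to retrieve or
    enhance; here the decision is simply "does the KG contain the next hop?".
    """
    kg = KnowledgeGraph()
    current = start_entity
    remaining = list(relations)
    for _ in range(max_iterations):
        if not remaining:
            return current, kg  # retrieval succeeded from the KG alone
        hits = kg.query(subject=current, predicate=remaining[0])
        if hits:
            current = hits[0][2]  # move to the object of the matching triple
            remaining.pop(0)
        else:
            new_facts = toy_lookup_tool(current)
            if not new_facts:
                return None, kg  # the tool cannot help; give up
            kg.add(new_facts)  # enhance the KG, then retry next iteration
    return (current if not remaining else None), kg


answer, kg = solve("Zurich", ["located_in", "capital"])
print(answer, len(kg.triples))
```

In this toy run the graph is built up over two tool calls, and the final answer is read from the graph rather than from the tool output directly, which mirrors the separation between knowledge acquisition and retrieval that the abstract describes.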
