Building Specialized Software-Assistant ChatBot with Graph-Based Retrieval-Augmented Generation
Mohammed Hilel, Yannis Karmim, Jean De Bodinat, Reda Sarehane, Antoine Gillon
TL;DR
This work presents a Graph-RAG framework that grounds large language models (LLMs) in enterprise software by automatically converting web interfaces into state-action graphs and retrieving relevant subgraphs with a Prize-Collecting Steiner Tree (PCST) method to provide context to the LLM. The approach enables precise, grounded, and multilingual software guidance without fine-tuning, addressing the hallucinations and deployment constraints typical of black-box LLMs. An industrial adaptation is demonstrated with a software-to-graph pipeline, graph-based retrieval, and production-friendly deployment within LemonAI, including qualitative evaluations on Salesforce and Dolibarr that show improved instruction detail and contextual accuracy. The results highlight practical considerations for scaling, robustness, and integration in production Digital Adoption Platform (DAP) workflows, with a roadmap toward broader deployment and potential automation of software actions.
Abstract
Digital Adoption Platforms (DAPs) have become essential tools for helping employees navigate complex enterprise software such as CRM, ERP, and HRMS applications. Companies like Lemon Learning have shown how digital guidance can reduce training costs and accelerate onboarding. However, building and maintaining these interactive guides still requires extensive manual effort. Using Large Language Models (LLMs) as virtual assistants is an appealing alternative, yet without a structured understanding of the target software, LLMs often hallucinate and produce unreliable answers. Moreover, most production-grade LLMs are black-box APIs whose weights are not accessible, which makes fine-tuning impractical. In this work, we introduce a Graph-based Retrieval-Augmented Generation (Graph-RAG) framework that automatically converts enterprise web applications into state-action knowledge graphs, enabling LLMs to generate grounded and context-aware assistance. The framework was co-developed with the AI company RAKAM, in collaboration with Lemon Learning. We detail the engineering pipeline that extracts and structures software interfaces, the design of the graph-based retrieval process, and the integration of our approach into production DAP workflows. Finally, we discuss scalability, robustness, and deployment lessons learned from industrial use cases.
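To make the notion of a state-action graph concrete, the following is a minimal sketch, not the authors' implementation. It assumes nodes are UI states (pages or dialogs) and edges are user actions, and it substitutes a plain breadth-first search for the PCST-based subgraph retrieval described in the paper. The graph contents, `shortest_action_path`, and `build_context` are hypothetical names chosen for illustration.

```python
from collections import deque

# Hypothetical state-action graph: each key is a UI state, each value is a
# list of (action, next_state) pairs. The contents are illustrative only.
GRAPH = {
    "Home": [("click 'Accounts' tab", "Accounts list")],
    "Accounts list": [("click 'New'", "New account form")],
    "New account form": [("fill fields and click 'Save'", "Account detail")],
    "Account detail": [],
}


def shortest_action_path(graph, start, goal):
    """Breadth-first search; returns the actions from start to goal, or None."""
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        state, actions = queue.popleft()
        if state == goal:
            return actions
        for action, next_state in graph.get(state, []):
            if next_state not in seen:
                seen.add(next_state)
                queue.append((next_state, actions + [action]))
    return None


def build_context(graph, start, goal):
    """Format the retrieved path as numbered steps for inclusion in a prompt."""
    actions = shortest_action_path(graph, start, goal)
    if actions is None:
        return "No known path."
    return "\n".join(f"{i}. {a}" for i, a in enumerate(actions, 1))


if __name__ == "__main__":
    print(build_context(GRAPH, "Home", "Account detail"))
```

The numbered steps returned by `build_context` stand in for the grounded context that would be passed to the LLM. A PCST-based retriever would instead select a connected subgraph that balances node relevance against edge cost, which this simplified path search does not attempt.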
