Cross-Format Retrieval-Augmented Generation in XR with LLMs for Context-Aware Maintenance Assistance

Akos Nagy, Yannis Spyridis, Vasileios Argyriou

TL;DR

The paper tackles context-aware maintenance support in XR by deploying a cross-format Retrieval-Augmented Generation (RAG) system that retrieves and generates instructions from heterogeneous data sources (text, PDFs, CSVs). It introduces a Complete Multi-Modal Cross-Format RAG Architecture with multi-path retrieval and a JSON-formatted knowledge base summary, which grounds LLM outputs in a maintenance knowledge base parsed from PDFs via pypdf and from CSVs via DuckDB. The evaluation covers eight LLMs (four OpenAI, four Llama) across three maintenance scenarios. It measures accuracy with BLEU and METEOR, records response speed, and adds a qualitative error analysis. GPT-4 and GPT-4o-mini generally outperform the alternatives, especially on complex, cross-format queries. The results demonstrate the practical potential of cross-format RAG in XR maintenance workflows while highlighting challenges such as false positives and model-specific verbosity, and they point to future work on retrieval refinement, context filtering, and real-world deployment.
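
To make the idea of a JSON-formatted knowledge base summary more concrete, here is a minimal sketch using only the Python standard library. It is not the authors' implementation: the function names, the summary fields, and the sample files are all hypothetical. The paper parses PDFs with pypdf and CSVs with DuckDB, whereas this sketch assumes the text has already been extracted and reads CSV content with the built-in `csv` module.

```python
import csv
import io
import json


def summarize_source(name: str, kind: str, content: str) -> dict:
    """Describe one knowledge-base entry without copying its full content."""
    if kind == "csv":
        rows = list(csv.reader(io.StringIO(content)))
        header, data = (rows[0], rows[1:]) if rows else ([], [])
        return {"name": name, "kind": kind, "columns": header, "row_count": len(data)}
    # Plain text (or text already extracted from a PDF): keep a short preview.
    words = content.split()
    return {
        "name": name,
        "kind": kind,
        "word_count": len(words),
        "preview": " ".join(words[:8]),
    }


def build_summary(sources: list[tuple[str, str, str]]) -> str:
    """Return a JSON string listing every source in the knowledge base."""
    entries = [summarize_source(n, k, c) for n, k, c in sources]
    return json.dumps({"sources": entries}, indent=2)


# Hypothetical sample data, invented for illustration only.
SAMPLE_SOURCES = [
    ("pump_manual.txt", "text",
     "Isolate the pump before opening the housing and drain the line."),
    ("sensor_log.csv", "csv",
     "timestamp,pressure_bar\n2024-01-01T10:00,4.2\n2024-01-01T10:05,4.4\n"),
]

summary = json.loads(build_summary(SAMPLE_SOURCES))
print(json.dumps(summary, indent=2))
```

A compact summary of this kind could be placed in the prompt so the model can see which sources exist and what they contain before a retrieval path is chosen. How the paper's system actually uses its summary is not specified in this overview.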

Abstract

This paper presents a detailed evaluation of a Retrieval-Augmented Generation (RAG) system that integrates large language models (LLMs) to enhance information retrieval and instruction generation for maintenance personnel across diverse data formats. We assessed the performance of eight LLMs, emphasizing key metrics such as response speed and accuracy, which were quantified using BLEU and METEOR scores. Our findings reveal that advanced models like GPT-4 and GPT-4o-mini significantly outperform their counterparts, particularly when addressing complex queries requiring multi-format data integration. The results validate the system's ability to deliver timely and accurate responses, highlighting the potential of RAG frameworks to optimize maintenance operations. Future research will focus on refining retrieval techniques for these models and enhancing response generation, particularly for intricate scenarios, ultimately improving the system's practical applicability in dynamic real-world environments.
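
For readers unfamiliar with BLEU, the sketch below shows the core idea behind the metric: clipped n-gram precision combined with a brevity penalty. It is a deliberately simplified, single-reference version limited to unigrams and bigrams, with whitespace tokenization and no smoothing. It is an assumption-laden illustration, not the scoring code used in the paper, and its scores will not match those of standard BLEU implementations. METEOR, which also accounts for stemming and synonym matches, is not covered here.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return all contiguous n-grams of a token list as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def clipped_precision(candidate, reference, n):
    """Fraction of candidate n-grams found in the reference, with counts clipped."""
    cand = Counter(ngrams(candidate, n))
    ref = Counter(ngrams(reference, n))
    total = sum(cand.values())
    if total == 0:
        return 0.0
    matched = sum(min(count, ref[gram]) for gram, count in cand.items())
    return matched / total


def simple_bleu(candidate_text, reference_text, max_n=2):
    """Simplified BLEU: geometric mean of clipped precisions times brevity penalty."""
    cand = candidate_text.lower().split()
    ref = reference_text.lower().split()
    if not cand:
        return 0.0
    precisions = [clipped_precision(cand, ref, n) for n in range(1, max_n + 1)]
    if min(precisions) == 0.0:
        return 0.0  # no smoothing: any zero precision gives a zero score
    geo_mean = math.exp(sum(math.log(p) for p in precisions) / max_n)
    # Penalize candidates shorter than the reference.
    brevity = 1.0 if len(cand) >= len(ref) else math.exp(1 - len(ref) / len(cand))
    return geo_mean * brevity


# Hypothetical reference instruction and model outputs, for illustration only.
reference = "close the valve"
print(simple_bleu("close the valve", reference))
print(simple_bleu("close the valve now", reference))
print(simple_bleu("open the door", reference))
```

The second call illustrates how extra words lower the score: verbose answers are penalized through reduced n-gram precision even when they contain the full reference.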

Paper Structure

This paper contains 9 sections, 4 figures, 3 tables.

Figures (4)

  • Figure 1: Demonstration of the maintenance application: A) maintenance personnel using the XR application via an OHMD while being monitored by support personnel using a computer application. The XR application supports B) textual and C) visual instructions.
  • Figure 2: Complete Multi-Modal Cross-Format RAG Architecture for Maintenance Procedure Support
  • Figure 3: RAG Architecture for Cross-Format Data Retrieval
  • Figure 4: Percentage of outcomes of the data generation process per model per scenario