Knowledge Sharing in Manufacturing using Large Language Models: User Evaluation and Model Benchmarking
Samuel Kernan Freire, Chaofan Wang, Mina Foosherian, Stefan Wellsandt, Santiago Ruiz-Arenas, Evangelos Niforatos
TL;DR
The paper investigates knowledge sharing in manufacturing by deploying an LLM-based system that uses Retrieval-Augmented Generation (RAG) to answer operator queries from factory manuals and issue reports. It assesses usability, adoption, and operational impact through a user study at a factory, and benchmarks multiple LLMs on factuality, completeness, and hallucinations. GPT-4 performed best, with open-source models trailing closely while offering advantages in data privacy and customization. The work demonstrates the feasibility of domain-specific LLM deployments in factories, identifies user-centered benefits and constraints, and notes that operators still prefer learning from a human expert when one is available. Overall, the study provides a concrete system design and preliminary evidence that LLMs can modernize knowledge management in manufacturing, while outlining avenues for longitudinal field studies and model and prompt customization to achieve reliable, scalable adoption.
Abstract
Recent advances in natural language processing enable more intelligent ways to support knowledge sharing in factories. In manufacturing, operating production lines has become increasingly knowledge-intensive, putting strain on a factory's capacity to train and support new operators. This paper introduces a Large Language Model (LLM)-based system designed to retrieve information from the extensive knowledge contained in factory documentation and knowledge shared by expert operators. The system aims to efficiently answer queries from operators and facilitate the sharing of new knowledge. We conducted a user study at a factory to assess its potential impact and adoption, eliciting several perceived benefits, namely quicker information retrieval and more efficient resolution of issues. However, the study also highlighted a preference for learning from a human expert when such an option is available. Furthermore, we benchmarked several commercial and open-source LLMs for this system. The current state-of-the-art model, GPT-4, consistently outperformed its counterparts, with open-source models trailing closely, presenting an attractive option given their data privacy and customization benefits. In summary, this work offers preliminary insights and a system design for factories considering using LLM tools for knowledge management.
