KRAIL: A Knowledge-Driven Framework for Base Human Reliability Analysis Integrating IDHEAS and Large Language Models
Xingyu Xiao, Peng Chen, Ben Qi, Hongru Zhao, Jingang Liang, Jiejuan Tong, Haitao Wang
TL;DR
KRAIL tackles the bottleneck of base HEP estimation in human reliability analysis by fusing IDHEAS-DATA with large language models and retrieval-augmented generation. The authors propose a two-stage framework: a multi-agent task-decomposition stage and a knowledge-graph–driven integration stage, leveraging Neo4j and few-shot prompts to extract PIF, CFM, and related attributes before computing the base HEP. Empirical results on IDHEAS-derived data show that 5-shot configurations yield the best accuracy, with substantial time savings versus manual methods and robust ablation evidence for the value of the multi-agent design. The approach offers a scalable, semi-automated pathway to reliable HEP estimation under partial information, with potential for broader knowledge integration and domain expansion in safety-critical industries.
Abstract
Human reliability analysis (HRA) is crucial for evaluating and improving the safety of complex systems. Recent efforts have focused on estimating human error probability (HEP), but existing methods often rely heavily on expert knowledge, which can be subjective and time-consuming. Inspired by the success of large language models (LLMs) in natural language processing, this paper introduces a novel two-stage framework for knowledge-driven reliability analysis, integrating IDHEAS and LLMs (KRAIL). The framework enables the semi-automated computation of base HEP values. Additionally, knowledge graphs are used as a form of retrieval-augmented generation (RAG) to enhance the framework's ability to retrieve and process relevant data efficiently. Experiments are systematically conducted and evaluated on authoritative human reliability datasets. The results demonstrate the superior performance of the proposed methodology on base HEP estimation under partial information for reliability assessment.
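To make the two-stage idea concrete, the following is a minimal sketch, not the authors' implementation. It assumes that stage one reduces a free-text task description to a cognitive failure mode (CFM) and a performance-influencing factor (PIF) attribute, and that stage two retrieves a base HEP keyed on that pair. The keyword rules stand in for the LLM agents, the dictionary stands in for the knowledge graph, and every name and probability below is a hypothetical placeholder rather than a value from IDHEAS-DATA.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical stand-in for the knowledge graph: (CFM, PIF attribute) -> base HEP.
# These numbers are placeholders for illustration, NOT values from IDHEAS-DATA.
TOY_GRAPH = {
    ("detection", "no_impact"): 1e-3,
    ("detection", "poor_visibility"): 1e-2,
    ("action_execution", "no_impact"): 1e-4,
}


@dataclass
class TaskAttributes:
    cfm: str            # cognitive failure mode (hypothetical label)
    pif_attribute: str  # performance-influencing factor attribute (hypothetical label)


def extract_attributes(description: str) -> TaskAttributes:
    """Stage 1 stand-in: simple keyword rules replace the LLM agents."""
    text = description.lower()
    cfm = "detection" if ("detect" in text or "notice" in text) else "action_execution"
    pif = "poor_visibility" if ("dim" in text or "obscured" in text) else "no_impact"
    return TaskAttributes(cfm, pif)


def lookup_base_hep(attrs: TaskAttributes) -> Optional[float]:
    """Stage 2 stand-in: a dictionary lookup replaces the graph query.

    Returns None when the toy graph has no entry for the pair.
    """
    return TOY_GRAPH.get((attrs.cfm, attrs.pif_attribute))


def estimate_base_hep(description: str) -> Optional[float]:
    """Chain the two stages: extract attributes, then retrieve the base HEP."""
    return lookup_base_hep(extract_attributes(description))


if __name__ == "__main__":
    print(estimate_base_hep("Operator must detect an alarm on a dim panel"))
    print(estimate_base_hep("Operator opens an obscured valve"))
```

Returning `None` for a pair that is missing from the toy graph keeps the partial-information case explicit: the caller can see that no base HEP was retrieved instead of receiving a silently defaulted number.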
