Identification of Regulatory Requirements Relevant to Business Processes: A Comparative Study on Generative AI, Embedding-based Ranking, Crowd and Expert-driven Methods
Catherine Sai, Shazia Sadiq, Lei Han, Gianluca Demartini, Stefanie Rinderle-Ma
TL;DR
The paper tackles the problem of identifying regulatory requirements relevant to business processes across diverse regulatory documents and jurisdictions. It compares four approaches—expert analysis, embedding-based NLP LIR, GPT-4, and crowdsourcing—using two real-world use cases and a gold standard created by domain experts. GPT-4 yields the highest recall among automated methods, while embedding-based ranking offers transparent pre-selection and crowdsourcing provides supplementary insights, with expert judgment remaining the gold standard. The study provides guidelines on method combinations based on process usage, impact, dynamics, and regulatory input, and releases public data to enable reproducibility.
Abstract
Organizations face the challenge of ensuring compliance with an increasing amount of requirements from various regulatory documents. Which requirements are relevant depends on aspects such as the geographic location of the organization, its domain, size, and business processes. Considering these contextual factors, as a first step, relevant documents (e.g., laws, regulations, directives, policies) are identified, followed by a more detailed analysis of which parts of the identified documents are relevant for which step of a given business process. Nowadays the identification of regulatory requirements relevant to business processes is mostly done manually by domain and legal experts, posing a tremendous effort on them, especially for a large number of regulatory documents which might frequently change. Hence, this work examines how legal and domain experts can be assisted in the assessment of relevant requirements. For this, we compare an embedding-based NLP ranking method, a generative AI method using GPT-4, and a crowdsourced method with the purely manual method of creating relevancy labels by experts. The proposed methods are evaluated based on two case studies: an Australian insurance case created with domain experts and a global banking use case, adapted from SAP Signavio's workflow example of an international guideline. A gold standard is created for both BPMN2.0 processes and matched to real-world textual requirements from multiple regulatory documents. The evaluation and discussion provide insights into strengths and weaknesses of each method regarding applicability, automation, transparency, and reproducibility and provide guidelines on which method combinations will maximize benefits for given characteristics such as process usage, impact, and dynamics of an application scenario.
