RAGent: Retrieval-based Access Control Policy Generation

Sakuna Harinda Jayasundara, Nalin Asanka Gamagedara Arachchilage, Giovanni Russello

TL;DR

This work proposes RAGent, a retrieval-based access control policy generation framework built on language models. RAGent automatically verifies the policies it generates and iteratively refines them through a novel verification-refinement mechanism, which improves the reliability of the process by 3% and brings the F1 score to 80.6%.

Abstract

Manually generating access control policies from an organization's high-level requirement specifications poses significant challenges. Administrators must sift through multiple documents containing such specifications and translate their access requirements into access control policies, which takes laborious effort. The complexities and ambiguities of these specifications also often cause system administrators to make errors during translation, leading to data breaches. The automated policy generation frameworks designed to help administrators in this process, however, are unreliable due to limitations such as the lack of domain adaptation. To improve the reliability of access control policy generation, we therefore propose RAGent, a novel retrieval-based access control policy generation framework based on language models. RAGent identifies access requirements from high-level requirement specifications with a state-of-the-art average F1 score of 87.9%. Through retrieval-augmented generation, RAGent then translates the identified access requirements into access control policies with an F1 score of 77.9%. Unlike existing frameworks, RAGent generates policies with complex components like purposes and conditions, in addition to subjects, actions, and resources. Moreover, RAGent automatically verifies the generated policies and iteratively refines them through a novel verification-refinement mechanism, further improving the reliability of the process by 3% and reaching an F1 score of 80.6%. We also introduce three annotated datasets for developing access control policy generation frameworks in the future, addressing the data scarcity of the domain.
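
To make the policy components named in the abstract concrete (subject, action, resource, purpose, condition), the following minimal sketch models one access control rule as a Python dataclass and scores a generated rule set against a reference set. The class name `Rule`, the `decision` field, and the exact-match F1 are assumptions made for illustration; they are not the paper's actual schema or evaluation procedure. The example sentence is the one used for Figure 2.

```python
from dataclasses import dataclass
from typing import Optional, Set


@dataclass(frozen=True)
class Rule:
    """One access control rule (hypothetical schema, for illustration only)."""
    decision: str                     # "allow" or "deny" (assumed field)
    subject: str
    action: str
    resource: str
    purpose: Optional[str] = None     # optional complex component
    condition: Optional[str] = None   # optional complex component


def f1_score(predicted: Set[Rule], reference: Set[Rule]) -> float:
    """Exact-match F1 between predicted and reference rule sets (assumed metric)."""
    true_positives = len(predicted & reference)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(reference)
    return 2 * precision * recall / (precision + recall)


# "The doctor can write prescriptions, but the nurse cannot."
reference = {
    Rule("allow", "doctor", "write", "prescription"),
    Rule("deny", "nurse", "write", "prescription"),
}
predicted = {
    Rule("allow", "doctor", "write", "prescription"),
    Rule("allow", "nurse", "write", "prescription"),  # wrong decision for the nurse
}
score = f1_score(predicted, reference)
print(f"F1 = {score:.2f}")
```

Here one of the two generated rules matches the reference, so precision and recall are both 0.5. A partial-credit metric over individual components would score the second rule differently; exact matching is simply the easiest variant to show.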

Paper Structure

This paper contains 30 sections, 4 figures, and 6 tables.

Figures (4)

  • Figure 1: High-level architecture of RAGent. Step 1: Pre-processing. Step 2: NLACP identification. Step 3: Retrieving information relevant to generating the access control policy of the NLACP. Step 4: Generating the access control policy using the retrieved information. Step 4.1: Post-processing the generated policy. Step 5: Verifying the access control policy, which allows correctly generated policies to be applied to the system and provides feedback if the generated policy is found incorrect. Step 6: Iteratively refining the generated policy using the verification feedback.
  • Figure 2: Structured representation of the NLACP "The doctor can write prescriptions, but the nurse cannot.".
  • Figure 3: Effect of utilizing organization-specific information to generate access control policies (Section \ref{subsec:approach/generation}) and to post-process generated policies (Section \ref{subsubsec:approach/post_processing}).
  • Figure 4: Effect of iteratively refining the generated policies based on the verification result.
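
The verification-refinement loop described in the Figure 1 caption (Steps 4 to 6) can be sketched as below. This is a minimal sketch under stated assumptions: the function `refine_until_verified`, the `max_rounds` limit, and the toy `toy_generate` and `toy_verify` stand-ins are hypothetical and only mimic the control flow; in RAGent the generator and verifier are language models, and the policy string format used here is invented for the example.

```python
from typing import Callable, Optional, Tuple


def refine_until_verified(
    nlacp: str,
    generate: Callable[[str, Optional[str]], str],
    verify: Callable[[str, str], Tuple[bool, str]],
    max_rounds: int = 3,  # assumed cap on refinement rounds
) -> Tuple[str, bool, int]:
    """Generate a policy, verify it, and regenerate with feedback until it passes.

    Returns (policy, verified, refinements_used).
    """
    feedback: Optional[str] = None
    policy = ""
    for refinements_used in range(max_rounds + 1):
        policy = generate(nlacp, feedback)            # Step 4 (Step 6 when feedback is set)
        is_correct, feedback = verify(nlacp, policy)  # Step 5
        if is_correct:
            return policy, True, refinements_used
    return policy, False, max_rounds


def toy_generate(nlacp: str, feedback: Optional[str]) -> str:
    # Stand-in generator: wrong on the first attempt, correct once feedback arrives.
    if feedback is None:
        return "allow(doctor, write, prescription); allow(nurse, write, prescription)"
    return "allow(doctor, write, prescription); deny(nurse, write, prescription)"


def toy_verify(nlacp: str, policy: str) -> Tuple[bool, str]:
    # Stand-in verifier: accepts only policies that deny the nurse.
    if "deny(nurse" in policy:
        return True, ""
    return False, "incorrect decision for subject 'nurse'"


policy, verified, refinements = refine_until_verified(
    "The doctor can write prescriptions, but the nurse cannot.",
    toy_generate,
    toy_verify,
)
print(policy, verified, refinements)
```

Capping the number of rounds keeps the loop from running forever when the verifier never accepts a policy. In that case the sketch returns the last attempt flagged as unverified, so it can be passed to an administrator for review rather than applied to the system.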