Using LLMs for Automated Privacy Policy Analysis: Prompt Engineering, Fine-Tuning and Explainability

Yuxin Chen, Peng Tang, Weidong Qiu, Shujun Li

TL;DR

This paper investigates automated privacy policy analysis using large language models (LLMs) by combining prompt engineering and LoRA fine-tuning across four corpora with hierarchical taxonomies. It demonstrates that the hybrid approach achieves state-of-the-art performance for privacy policy concept classification and provides high-quality explainability, measured by completeness, logicality, and comprehensibility (each scoring above 91.1% in human evaluations). The work offers a practical pathway for accurate, explainable privacy policy analysis and lays groundwork for downstream tasks such as reader-friendly summaries and regulatory compliance checks. Limitations include prompt-based performance gaps, resource constraints, and the need for ongoing exploration of continual pre-training and larger models, which the authors intend to address in future work.

Abstract

Privacy policies are widely used by digital services and are often required for legal purposes. Many machine-learning-based classifiers have been developed to automate the detection of different concepts in a given privacy policy, which can facilitate other automated tasks such as producing a more reader-friendly summary and detecting legal compliance issues. Despite the successful application of large language models (LLMs) to many NLP tasks in various domains, there is very little work studying the use of LLMs for automated privacy policy analysis; therefore, if and how LLMs can help automate privacy policy analysis remains under-explored. To fill this research gap, we conducted a comprehensive evaluation of LLM-based privacy policy concept classifiers, employing both prompt engineering and LoRA (low-rank adaptation) fine-tuning, on four state-of-the-art (SOTA) privacy policy corpora and taxonomies. Our experimental results demonstrated that combining prompt engineering and fine-tuning can make LLM-based classifiers outperform other SOTA methods significantly and consistently across privacy policy corpora/taxonomies and concepts. Furthermore, we evaluated the explainability of the LLM-based classifiers using three metrics: completeness, logicality, and comprehensibility. For all three metrics, a score exceeding 91.1% was observed in our evaluation, indicating that LLMs are useful not only for improving classification performance but also for enhancing the explainability of detection results.

Paper Structure

This paper contains 22 sections, 4 figures, 6 tables.

Figures (4)

  • Figure 1: A partial hierarchy of GoPPC-150.
  • Figure 2: Two-leveled fine-tuning process.
  • Figure 3: Effect of the model size on the performance. Standards s1-s6 represent OPP-115, GoPPC-150 level-1 nodes, GoPPC-150 all nodes, CAPP-130, APPCP-100 level-1 nodes, and APPCP-100 all nodes, respectively.
  • Figure 4: Examples of the five types of prompts used in our experiments. The content within the yellow dashed box is not included in Prompt 5.