A Method for Quantifying Human Risk and a Blueprint for LLM Integration
Giuseppe Canale
TL;DR
The paper addresses the problem of quantifying human-centric risk in cybersecurity by introducing the Cybersecurity Psychology Framework (CPF), an end-to-end approach that integrates established psychological constructs with SOC telemetry. It defines 100 indicators and algorithmic measurements for vulnerabilities such as Compliance Fatigue, Alert Overload Bias, and Risk Perception Gaps, and proposes a privacy-preserving, on-premise Retrieval-Augmented Generation (RAG) LLM architecture to analyze data and generate actionable insights. The paper outlines a rigorous validation plan: Phase 1, already completed on synthetic data, achieved an F1-score of 0.92, and later phases move to retrospective and then prospective validation through industry partnerships, with the difficulty of accessing real operational data explicitly acknowledged. The framework emphasizes ethical and privacy safeguards, data minimization, governance, and transparency to build trust, and aims to provide a practical, scalable tool for reducing human-factor-driven breaches and improving SOC effectiveness. Collectively, CPF offers a unified, automatable pathway from psychological theory to operational risk mitigation, with potential for broad industry impact and iterative refinement through partnerships.
Abstract
This paper presents the Cybersecurity Psychology Framework (CPF), a novel methodology for quantifying human-centric vulnerabilities in security operations through systematic integration of established psychological constructs with operational security telemetry. While individual human factors (alert fatigue, compliance fatigue, cognitive overload, and risk perception biases) have been extensively studied in isolation, no framework provides end-to-end operationalization across the full spectrum of psychological vulnerabilities. We address this gap by: (1) defining specific, measurable algorithms that quantify key psychological states using standard SOC tooling (SIEM, ticketing systems, communication platforms); (2) proposing a lightweight, privacy-preserving LLM architecture based on Retrieval-Augmented Generation (RAG) and domain-specific fine-tuning to analyze structured and unstructured data for latent psychological risks; and (3) detailing a rigorous mixed-methods validation strategy that acknowledges the inherent difficulty of obtaining sensitive cybersecurity data. A proof-of-concept deployment of the CPF indicators using small language models achieves an F1-score of 0.92 on synthetic data. This work provides the theoretical and methodological foundation necessary for industry partnerships to conduct empirical validation with real operational data.
