LLMCloudHunter: Harnessing LLMs for Automated Extraction of Detection Rules from Cloud-Based CTI
Yuval Schwartz, Lavi Benshimol, Dudu Mimran, Yuval Elovici, Asaf Shabtai
TL;DR
The paper introduces LLMCloudHunter, an end-to-end framework that uses pretrained LLMs to automatically generate Sigma detection rule candidates from unstructured text- and image-based OSCTI about cloud environments. It processes each OSCTI source in three phases (preprocessing, paragraph-level extraction and mapping, and OSCTI-level consolidation) to produce actionable Sigma rule candidates enriched with IoCs and MITRE ATT&CK mappings. On 12 annotated real-world cloud threat reports, the approach achieves 92% precision and 98% recall for API calls and 99% precision and 98% recall for IoCs, and 99.18% of the generated rule candidates compile and convert into Splunk queries. The work also demonstrates the value of visual data analysis and structured prompt design in threat intelligence workflows. By releasing both the annotated cloud OSCTI dataset and the code, the authors support reproducibility and practical deployment in SIEM ecosystems, pointing to a scalable path for automating cloud-specific threat detection.
Abstract
As the number and sophistication of cyber attacks have increased, threat hunting has become a critical aspect of active security, enabling proactive detection and mitigation of threats before they cause significant harm. Open-source cyber threat intelligence (OSCTI) is a valuable resource for threat hunters; however, it often comes in unstructured formats that require further manual analysis. Previous studies aimed at automating OSCTI analysis are limited because (1) they failed to provide actionable outputs, (2) they did not take advantage of images present in OSCTI sources, and (3) they focused on on-premises environments, overlooking the growing importance of cloud environments. To address these gaps, we propose LLMCloudHunter, a novel framework that leverages large language models (LLMs) to automatically generate generic-signature detection rule candidates from textual and visual OSCTI data. We evaluated the quality of the rules generated by the proposed framework using 12 annotated real-world cloud threat reports. The results show that our framework achieved a precision of 92% and a recall of 98% for extracting API calls made by the threat actor, and a precision of 99% and a recall of 98% for IoCs. Additionally, 99.18% of the generated detection rule candidates were successfully compiled and converted into Splunk queries.
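
To make the reported metrics concrete, the following is a minimal sketch of how precision and recall could be computed for extracted API calls against an annotated report. It assumes exact string matching on sets of API call names. The abstract does not specify the matching criteria, so the function name `precision_recall` and the sample API call sets are hypothetical illustrations, not the authors' evaluation code.

```python
def precision_recall(predicted, expected):
    """Set-based precision and recall, assuming exact string matches."""
    predicted, expected = set(predicted), set(expected)
    true_positives = len(predicted & expected)
    # Precision: fraction of extracted items that appear in the annotation.
    precision = true_positives / len(predicted) if predicted else 0.0
    # Recall: fraction of annotated items that were extracted.
    recall = true_positives / len(expected) if expected else 0.0
    return precision, recall


# Hypothetical data: API calls annotated in one report vs. those extracted.
annotated = {"ListBuckets", "GetObject", "CreateUser", "AttachUserPolicy"}
extracted = {"ListBuckets", "GetObject", "CreateUser", "DescribeInstances"}

p, r = precision_recall(extracted, annotated)
print(f"precision={p:.2f} recall={r:.2f}")
```

In this toy case three of the four extracted calls are correct and three of the four annotated calls are found, so both values are 0.75. The same computation would apply to IoCs under the same exact-match assumption.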
