Ruling the Unruly: Designing Effective, Low-Noise Network Intrusion Detection Rules for Security Operations Centers
Koen T. W. Teuwen, Tom Mulders, Emmanuele Zambon, Luca Allodi
TL;DR
This study analyzes NIDS rules used in a commercial SOC to understand the factors that drive rule quality and analyst workload. By combining 30M alerts, 290k rule revisions, and 42 incidents across two organizations with expert interviews, the authors derive six rule design principles and quantify their impact on rule specificity via regression analyses. Key findings include that using proxies for detection, lacking alert throttling, and not distinguishing between successful and unsuccessful actions increase workload, while throttling can dramatically reduce alerts with minimal impact on coverage. The work demonstrates that these principles can be operationalized in real SOC settings, offering practical guidelines for designing low-noise, high-coverage NIDS rules and a tool to identify deviations from the principles across rule sets.
Abstract
Many Security Operations Centers (SOCs) today still rely heavily on signature-based Network Intrusion Detection Systems (NIDS) such as Suricata. The specificity of intrusion detection rules and the coverage provided by rulesets are common concerns within the professional community surrounding SOCs, and both impact the effectiveness of automated alert post-processing approaches. We postulate that a better understanding of the factors influencing rule quality can help address current SOC issues. In this paper, we characterize the rules in use at a collaborating commercial (managed) SOC serving customers in sectors including education and IT management. During this process, we discover six relevant design principles, which we consolidate through interviews with experienced rule designers at the SOC. We then validate our design principles by quantitatively assessing their effect on rule specificity. We find that several of these design considerations significantly impact the unnecessary workload caused by rules. For instance, rules that leverage proxies for detection, and rules that do not employ alert throttling or do not distinguish (un)successful malicious actions, cause significantly more workload for SOC analysts. Moreover, rules that match a generalized characteristic to detect malicious behavior, which is believed to increase coverage, also significantly increase workload, suggesting that a tradeoff must be struck between rule specificity and coverage. We show that these design principles can be applied successfully at a SOC to reduce workload whilst maintaining coverage, despite the prevalence of violations of the principles.
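To make the notion of alert throttling concrete, the following is a minimal Python sketch, not the authors' tool or method. It assumes a hypothetical `Alert` record with a timestamp, a rule identifier, and a source IP, and it keeps at most one alert per (rule, source) pair within a fixed time window. The names `Alert`, `throttle`, and `window_seconds` are illustrative assumptions; the behavior is loosely analogous to a per-source limit threshold in a NIDS such as Suricata.

```python
from typing import Dict, Iterable, List, NamedTuple, Tuple


class Alert(NamedTuple):
    """Hypothetical alert record; the field names are assumptions."""
    timestamp: float  # seconds since some epoch
    rule_id: int
    src_ip: str


def throttle(alerts: Iterable[Alert], window_seconds: float = 60.0) -> List[Alert]:
    """Keep at most one alert per (rule_id, src_ip) per time window.

    The first alert for a key opens a window; later alerts for the same
    key are dropped until window_seconds have elapsed since that alert.
    """
    window_start: Dict[Tuple[int, str], float] = {}
    kept: List[Alert] = []
    for alert in sorted(alerts, key=lambda a: a.timestamp):
        key = (alert.rule_id, alert.src_ip)
        start = window_start.get(key)
        if start is None or alert.timestamp - start >= window_seconds:
            window_start[key] = alert.timestamp
            kept.append(alert)
    return kept


raw = [
    Alert(0.0, 1001, "10.0.0.5"),
    Alert(5.0, 1001, "10.0.0.9"),   # different source, kept
    Alert(10.0, 1001, "10.0.0.5"),  # same rule and source, suppressed
    Alert(59.0, 1001, "10.0.0.5"),  # still inside the window, suppressed
    Alert(61.0, 1001, "10.0.0.5"),  # window elapsed, kept
]
reduced = throttle(raw, window_seconds=60.0)
```

In this toy input, five raw alerts collapse to three, while every (rule, source) pair that fired is still represented at least once. That is the intuition behind throttling reducing analyst workload with little loss of coverage.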
