False Alarms, Real Damage: Adversarial Attacks Using LLM-based Models on Text-based Cyber Threat Intelligence Systems

Samaneh Shafee, Alysson Bessani, Pedro M. Ferreira

TL;DR

The paper investigates vulnerabilities in text-based Cyber Threat Intelligence pipelines by analyzing evasion, flooding, and poisoning attacks that leverage LLM-generated fake text. It introduces an integrated CTI pipeline model and an attention-based FaN generation approach using ChatGPT to create cybersecurity-like yet non-security content, enabling realistic system-level testing. Experimental results demonstrate that evasion can trigger downstream flooding and poisoning, with classifiers showing near-random discrimination on adversarial inputs and poisoning progressively degrading performance over retraining rounds. The work further discusses defense directions, including fact-checking, source credibility, semantic validation, segment-level analysis, and explainable AI, to bolster automated CTI systems against evolving adversarial threats.

Abstract

Cyber Threat Intelligence (CTI) has emerged as a vital complementary approach that operates in the early phases of the cyber threat lifecycle. CTI involves collecting, processing, and analyzing threat data to provide a more accurate and rapid understanding of cyber threats. Due to the large volume of data, automation through Machine Learning (ML) and Natural Language Processing (NLP) models is essential for effective CTI extraction. These automated systems leverage Open Source Intelligence (OSINT) from sources like social networks, forums, and blogs to identify Indicators of Compromise (IoCs). Although prior research has focused on adversarial attacks on specific ML models, this study expands the scope by investigating vulnerabilities within various components of the entire CTI pipeline and their susceptibility to adversarial attacks. These vulnerabilities arise because CTI pipelines ingest textual inputs from various open sources, which may contain both real and fake content. We analyze three types of attacks against CTI pipelines (evasion, flooding, and poisoning) and assess their impact on the system's information selection capabilities. On fake text generation specifically, the work demonstrates how adversarial text generation techniques can create fake cybersecurity and cybersecurity-like text that misleads classifiers, degrades performance, and disrupts system functionality. The focus is primarily on the evasion attack, as it precedes and enables flooding and poisoning attacks within the CTI pipeline.

Paper Structure

This paper contains 28 sections, 5 equations, 10 figures, 7 tables.

Figures (10)

  • Figure 1: Taxonomy of input text in a CTI Pipeline.
  • Figure 2: Proposed integrated CTI extraction pipeline.
  • Figure 3: Illustration of generating adversarial text using the attention mechanism and ChatGPT-4o. The generated adversarial text is input to a pre-trained binary classifier to mislead its predictions. A: The color intensity indicates the importance of words, ranging from less important (light) to highly important (dark). The top three most important tokens in each tweet were incorporated into the prompt.
  • Figure 4: Flooding attack workflow.
  • Figure 5: Distribution of semantic similarity scores within the dataset of tweets.
  • ...and 5 more figures
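
The Figure 3 caption describes taking the three most important tokens of each tweet and incorporating them into a prompt. A minimal sketch of that selection step follows, assuming per-token importance scores are already available (for example, from an attention layer). The function names, the example scores, and the prompt wording are all hypothetical and are not taken from the paper.

```python
from typing import List, Sequence


def top_k_tokens(tokens: Sequence[str], scores: Sequence[float], k: int = 3) -> List[str]:
    """Return the k tokens with the highest importance scores, highest first."""
    if len(tokens) != len(scores):
        raise ValueError("tokens and scores must have the same length")
    # Rank token positions by score, descending; ties keep their original order.
    ranked = sorted(range(len(tokens)), key=lambda i: scores[i], reverse=True)
    return [tokens[i] for i in ranked[:k]]


def build_prompt(keywords: Sequence[str]) -> str:
    """Build a generation prompt from the selected keywords.

    The wording is a hypothetical placeholder, not the paper's prompt.
    """
    joined = ", ".join(keywords)
    return (
        "Write a short tweet that uses the words "
        f"{joined} but is not about cybersecurity."
    )


# Made-up tweet tokens and importance scores, for illustration only.
tokens = ["new", "ransomware", "exploit", "targets", "windows", "servers"]
scores = [0.05, 0.40, 0.25, 0.08, 0.15, 0.07]

keywords = top_k_tokens(tokens, scores)
prompt = build_prompt(keywords)
print(prompt)
```

The resulting prompt would then be sent to a text generation model, and its output passed to the pre-trained binary classifier, as the caption outlines. Those steps are omitted here because they depend on model and API details not given in this summary.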