A dataset for cyber threat intelligence modeling of connected autonomous vehicles

Yinghui Wang, Yilong Ren, Hongmao Qin, Zhiyong Cui, Yanan Zhao, Haiyang Yu

TL;DR

This work presents a novel corpus designed specifically for vehicle cybersecurity knowledge mining. The corpus comprises 908 real automotive cybersecurity reports, 8195 security entities and 4852 semantic relations, and the work uses it to conduct a comprehensive analysis of CTI knowledge mining algorithms.

Abstract

Cyber attacks have become a serious threat to connected autonomous vehicles in intelligent transportation systems. Cyber threat intelligence (CTI), the collection of cyber threat information, offers a promising approach for responding to emerging vehicle cyber threats and enabling proactive security defense. Using knowledge extraction technologies to obtain valuable information from large volumes of cybersecurity data, and thereby build cyber threat intelligence models, is an effective means of strengthening automotive cybersecurity. Unfortunately, no existing cybersecurity dataset is available for cyber threat intelligence modeling research in the automotive field. This paper reports the creation of a cyber threat intelligence corpus focusing on vehicle cybersecurity knowledge mining. The dataset, annotated using a joint labeling strategy, comprises 908 real automotive cybersecurity reports containing 3678 sentences, 8195 security entities and 4852 semantic relations. We further conduct a comprehensive analysis of cyber threat intelligence mining algorithms based on this corpus. The proposed dataset will serve as a valuable resource for evaluating the performance of existing algorithms and advancing research in cyber threat intelligence modeling within the automotive field.

Paper Structure

This paper contains 17 sections, 3 equations, 6 figures and 7 tables.

Figures (6)

  • Figure 1: Automotive CTI Ontology Model.
  • Figure 2: Brat manual annotation.
  • Figure 3: Automotive CTI annotation data.
  • Figure 4: BERT-BiLSTM-att-CRF model.
  • Figure 5: BiLSTM-dynamic-att-LSTM model.
  • ...and 1 more figure