Secu-Table: A Comprehensive Security Table Dataset for Evaluating Semantic Table Interpretation Systems
Azanzi Jiomekong, Jean Bikim, Patricia Negoue, Joyce Chin
TL;DR
Secu-Table introduces a comprehensive, domain-specific security table dataset for evaluating semantic table interpretation (STI) systems, especially those leveraging LLMs. It combines CVE and CWE data with annotations from Wikidata and the SEPSES CSKG to create over 1,500 tables (v2) containing hundreds of thousands of entities and rows, suitable for SemTab-style table-to-knowledge-graph matching. The construction pipeline encompasses data sources, KG linking, two-tier curation, and manual annotation (CEA/CTA/CPA), with deliberate error injection to mimic real-world data quality. The authors provide open access to code and data (MIT-licensed) and present baseline evaluations using open and closed LLMs, establishing a reproducible benchmark for security-domain STI. Quarterly releases and plans to expand data sources (e.g., ATT&CK, CCE) aim to improve coverage and realism for downstream security analytics applications.
Abstract
Evaluating semantic table interpretation (STI) systems, particularly those based on Large Language Models (LLMs), depends heavily on the dataset, especially in domain-specific contexts such as security. However, in the security domain, tabular datasets for evaluating state-of-the-art STI systems are not publicly available. In this paper, we introduce the Secu-Table dataset, composed of more than 1500 tables with more than 15k entities, constructed using security data extracted from the Common Vulnerabilities and Exposures (CVE) and Common Weakness Enumeration (CWE) data sources and annotated using Wikidata and the SEmantic Processing of Security Event Streams CyberSecurity Knowledge Graph (SEPSES CSKG). Along with the dataset, all the code is publicly released. This dataset is made available to the research community in the context of the SemTab challenge on Tabular to Knowledge Graph Matching. This challenge aims to evaluate the performance of several STI systems based on open source LLMs. A preliminary evaluation, serving as a baseline, was conducted using two open source LLMs, Falcon3-7b-instruct and Mistral-7B-Instruct, and one closed source LLM, GPT-4o mini.
