Hypersparse Traffic Matrices from Suricata Network Flows using GraphBLAS

Michael Houle, Michael Jones, Dan Wallmeyer, Risa Brodeur, Justin Burr, Hayden Jananthan, Sam Merrell, Peter Michaleas, Anthony Perez, Andrew Prout, Jeremy Kepner

TL;DR

This work addresses scalable analysis of hypersparse network traffic by constructing traffic matrices from Suricata flow records with GraphBLAS. A json2grb pipeline converts flow JSON into directed edge counts within a $2^{32} \times 2^{32}$ address space, with anonymization performed via CryptoPAN and batching windows of $2^{17}$ packets. Key contributions include a low-memory ingestion workflow, TAR-archived batches, and LZ4-compressed storage of hypersparse matrices, with GraphBLAS routines performing the matrix construction. The reported results show high throughput and modest memory and storage requirements, supporting practical deployment of privacy-preserving, sensor-based traffic analysis.

Abstract

Hypersparse traffic matrices constructed from network packet source and destination addresses are a powerful tool for gaining insights into network traffic. SuiteSparse:GraphBLAS, an open source package for building, manipulating, and analyzing large hypersparse matrices, is one approach to constructing these traffic matrices. Suricata is a widely used open source network intrusion detection software package. This work demonstrates how Suricata network flow records can be used to efficiently construct hypersparse matrices using GraphBLAS.
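
As a rough illustration of the matrix construction described above, the following sketch accumulates directed packet counts in coordinate form, keyed by (source, destination) indices in the $2^{32} \times 2^{32}$ IPv4 address space. It uses a plain Python dictionary as a stand-in for a GraphBLAS hypersparse matrix. The function names `ipv4_to_index` and `build_traffic_matrix` and the sample addresses are hypothetical; this is not the authors' json2grb implementation, and it omits anonymization.

```python
import ipaddress
from collections import defaultdict


def ipv4_to_index(addr: str) -> int:
    """Map a dotted-quad IPv4 address to its integer index in [0, 2**32)."""
    return int(ipaddress.IPv4Address(addr))


def build_traffic_matrix(edges):
    """Accumulate (src, dst, packets) triples into a sparse coordinate map.

    Only nonzero entries are stored, so memory scales with the number of
    distinct (src, dst) pairs rather than with the 2**32 x 2**32 index space.
    A dict is used here purely as a stand-in for a GraphBLAS matrix.
    """
    matrix = defaultdict(int)
    for src, dst, packets in edges:
        matrix[(ipv4_to_index(src), ipv4_to_index(dst))] += packets
    return dict(matrix)


# Hypothetical sample data, invented for illustration.
edges = [
    ("10.0.0.1", "192.168.1.5", 3),
    ("10.0.0.1", "192.168.1.5", 2),  # same pair again: counts are summed
    ("192.168.1.5", "10.0.0.1", 1),  # reverse direction is a separate entry
]
A = build_traffic_matrix(edges)
```

Because the matrix is directed, traffic from A to B and from B to A lands in different entries; summing duplicates mirrors the usual "accumulate on build" behavior of sparse matrix constructors.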

Paper Structure (4 sections, 1 figure)

This paper contains 4 sections, 1 figure.

Figures (1)

  • Figure 1: Suricata Traffic Flow Processing. A PCAP file containing CAIDA network telescope data is read by the Suricata IDS in offline replay mode, and Suricata is configured to emit 'flow' records as JSON via an EVE logging target. This JSON log file is then parsed by our program, which converts Suricata flow records into anonymized GraphBLAS traffic matrices. $^1$Assumes 100 packets per flow for typical real-world traffic [jurkiewicz2021flow].
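
The JSON-parsing stage in Figure 1 can be sketched roughly as follows. The example reads EVE-style JSON lines, keeps only records whose `event_type` is `flow`, and extracts the endpoint addresses and per-direction packet counts. The field names (`src_ip`, `dest_ip`, `flow.pkts_toserver`, `flow.pkts_toclient`) are assumed to match Suricata's EVE flow output and should be checked against the deployed Suricata version. The helper `flow_edges` and the sample records are hypothetical, and the anonymization and GraphBLAS steps are left out.

```python
import json


def flow_edges(lines):
    """Yield (src_ip, dest_ip, pkts_toserver, pkts_toclient) per flow record."""
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip truncated or malformed lines
        if record.get("event_type") != "flow":
            continue  # ignore non-flow events (alerts, DNS, ...)
        flow = record.get("flow", {})
        yield (
            record["src_ip"],
            record["dest_ip"],
            flow.get("pkts_toserver", 0),
            flow.get("pkts_toclient", 0),
        )


# Hypothetical log lines, invented for illustration.
sample_log = [
    '{"event_type": "flow", "src_ip": "10.0.0.1", "dest_ip": "192.168.1.5",'
    ' "flow": {"pkts_toserver": 3, "pkts_toclient": 1}}',
    '{"event_type": "dns", "src_ip": "10.0.0.9", "dest_ip": "8.8.8.8"}',
    '{"event_type": "flow", "src_ip": "10.0.0.2", "dest_ip": "192.168.1.5",'
    ' "flow": {"pkts_toserver": 7, "pkts_toclient": 0}}',
    '{"event_type": "flow", "src_ip": ',  # truncated line, skipped
]
edges = list(flow_edges(sample_log))
```

Processing the log one line at a time keeps memory use independent of the log size, which fits the low-memory ingestion goal stated in the TL;DR.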