Sharpening Kubernetes Audit Logs with Context Awareness

Matteo Franzil; Valentino Armani; Luis Augusto Dias Knob; Domenico Siracusa

TL;DR

Kubernetes audit logs are voluminous and their entries are only loosely linked, which hinders timely analysis. The authors introduce K8NTEXT, a pipeline that uses a BiLSTM-based label predictor and a windowed clustering strategy to reconstruct contexts, i.e., to link triggering events with their cascading consequences, in real time. The evaluation reports high macro-F1 scores, scalable inference, and substantial storage reductions, with effective clustering and query capabilities across scenarios. The work enables real-time, context-aware monitoring and threat detection in Kubernetes environments, reducing noise and improving incident analysis. Future directions include incremental learning and integration with broader security tooling.

Abstract

Kubernetes (K8s) has emerged as the de facto orchestrator of microservices, providing scalability and extensibility in a highly dynamic environment. It builds an intricate and deeply connected system that requires extensive monitoring capabilities to be properly managed. To this end, K8s natively offers audit logs, a powerful feature for tracking API interactions in the cluster. Audit logs provide a detailed, chronological record of all activities in the system. Unfortunately, K8s auditing suffers from several practical limitations: it continuously generates large volumes of data, as all components within the cluster interact and respond to user actions. Moreover, each action can trigger a cascade of secondary events dispersed across the log, with little to no explicit linkage, making it difficult to reconstruct the context behind user-initiated operations. In this paper, we introduce K8NTEXT, a novel approach for streamlining K8s audit logs by reconstructing contexts, i.e., grouping actions performed by actors on the cluster with the subsequent events these actions cause. Correlated API calls are automatically identified, labeled, and consistently grouped using a combination of inference rules and a machine learning model, greatly simplifying data consumption. We evaluate K8NTEXT's performance, scalability, and expressiveness both in systematic tests and with a series of use cases. We show that it consistently provides accurate context reconstruction, even for complex operations involving 50, 100, or more correlated actions, achieving over 95 percent accuracy across the entire spectrum, from simple to highly composite actions.
