Sharpening Kubernetes Audit Logs with Context Awareness

Matteo Franzil, Valentino Armani, Luis Augusto Dias Knob, Domenico Siracusa

TL;DR

Kubernetes audit logs are voluminous and their entries are only loosely linked, which hinders timely analysis. The authors introduce K8NTEXT, a pipeline that uses a BiLSTM-based label predictor and a windowed clustering strategy to reconstruct contexts in real time, linking each triggering event with its cascading consequences. The evaluation reports high macro-F1 scores, scalable inference, and substantial storage reductions, along with effective clustering and query capabilities across scenarios. The approach enables real-time, context-aware monitoring and threat detection in Kubernetes environments, reducing noise and improving incident analysis. Future directions include incremental learning and integration with broader security tooling.

Abstract

Kubernetes has emerged as the de facto orchestrator of microservices, providing scalability and extensibility to a highly dynamic environment. It builds an intricate and deeply connected system that requires extensive monitoring capabilities to be properly managed. To this end, K8s natively offers audit logs, a powerful feature for tracking API interactions in the cluster. Audit logs provide a detailed and chronological record of all activities in the system. Unfortunately, K8s auditing suffers from several practical limitations: it generates large volumes of data continuously, as all components within the cluster interact and respond to user actions. Moreover, each action can trigger a cascade of secondary events dispersed across the log, with little to no explicit linkage, making it difficult to reconstruct the context behind user-initiated operations. In this paper, we introduce K8NTEXT, a novel approach for streamlining K8s audit logs by reconstructing contexts, i.e., grouping actions performed by actors on the cluster with the subsequent events these actions cause. Correlated API calls are automatically identified, labeled, and consistently grouped using a combination of inference rules and a Machine Learning model, greatly simplifying data consumption. We evaluate K8NTEXT's performance, scalability, and expressiveness both in systematic tests and with a series of use cases. We show that it consistently provides accurate context reconstruction, even for complex operations involving 50, 100, or more correlated actions, achieving over 95 percent accuracy across the entire spectrum, from simple to highly composite actions.
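
To make the notion of a context concrete, the following is a minimal sketch of grouping a user-initiated action with the events that follow it. It is not the K8NTEXT method, which combines inference rules with a Machine Learning model. It is a naive time-window heuristic, and several things in it are assumptions made for illustration: the flattened event dictionaries (real audit events nest the resource and namespace under `objectRef` and carry an RFC 3339 `requestReceivedTimestamp` rather than the numeric `ts` used here), the `is_user_action` rule that treats `kubectl` user agents as triggers, and the 5-second window.

```python
from dataclasses import dataclass, field


@dataclass
class Context:
    """A triggering action plus the follow-up events attributed to it."""
    trigger: dict
    consequences: list = field(default_factory=list)


def is_user_action(event: dict) -> bool:
    # Hypothetical heuristic: requests issued through kubectl count as triggers.
    return event.get("userAgent", "").startswith("kubectl")


def group_into_contexts(events: list, window_seconds: float = 5.0) -> list:
    """Attach each non-user event to the most recent trigger in the same
    namespace, provided it arrives within `window_seconds` of that trigger."""
    contexts = []
    current = None
    for ev in sorted(events, key=lambda e: e["ts"]):
        if is_user_action(ev):
            current = Context(trigger=ev)
            contexts.append(current)
        elif (
            current is not None
            and ev["ts"] - current.trigger["ts"] <= window_seconds
            and ev["namespace"] == current.trigger["namespace"]
        ):
            current.consequences.append(ev)
        # Anything else is left unattached in this sketch.
    return contexts


# Simplified, made-up events loosely modeled on a Deployment creation.
events = [
    {"ts": 0.0, "userAgent": "kubectl/v1.29", "verb": "create",
     "resource": "deployments", "namespace": "demo"},
    {"ts": 0.2, "userAgent": "kube-controller-manager", "verb": "create",
     "resource": "replicasets", "namespace": "demo"},
    {"ts": 0.5, "userAgent": "kube-controller-manager", "verb": "create",
     "resource": "pods", "namespace": "demo"},
    {"ts": 0.6, "userAgent": "kube-scheduler", "verb": "create",
     "resource": "bindings", "namespace": "other"},
    {"ts": 30.0, "userAgent": "kubelet", "verb": "patch",
     "resource": "pods", "namespace": "demo"},
]

contexts = group_into_contexts(events)
for ctx in contexts:
    print(ctx.trigger["verb"], ctx.trigger["resource"],
          "->", [c["resource"] for c in ctx.consequences])
```

A fixed window like this one misattributes events whenever operations overlap or a cascade outlasts the window, which is the kind of ambiguity that motivates a learned label predictor over a purely rule-based grouping.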

Paper Structure

This paper contains 52 sections, 13 figures, and 7 tables.

Figures (13)

  • Figure 1: A screenshot of a K8s audit log line for a Deployment action.
  • Figure 2: Simplified overview of the interactions in a K8s cluster. Yellow components are the possible outputs of the audit logging feature.
  • Figure 3: Interactions between components when creating a namespace. The various user agents are shown on the left. Each outer box represents a namespace, inner boxes represent resource types, and dark boxes represent objects. Arrows represent interactions between components and are labeled with the verb used in the action.
  • Figure 4: System architecture of K8NTEXT. Dotted lines indicate the flow for the training phase, while solid lines indicate the flow for the inference phase. Black boxes represent the deep learning (DL) components, blue (dark) boxes represent the K8s components, and yellow (light) boxes represent all other log-handling components. The query engine is optional and is omitted from the figure for clarity.
  • Figure 5: The architecture of the deep learning model used for label prediction.
  • ...and 8 more figures