Observability and Incident Response in Managed Serverless Environments Using Ontology-Based Log Monitoring
Lavi Ben-Shimol, Edita Grolman, Aviad Elyashar, Inbar Maimon, Dudu Mimran, Oleg Brodt, Martin Strassmann, Heiko Lehmann, Yuval Elovici, Asaf Shabtai
TL;DR
The paper addresses limited observability and incident response in fully managed serverless environments by introducing the Perimeterless stack, a three-layer security framework built around a serverless ontology and a graph-based knowledge representation of application activity. The stack comprises a generic serverless ontology, a cloud service provider (CSP)-specific pipeline that builds an activity knowledge graph, and two situational-awareness tools: an incident response (IR) dashboard and a Criticality of Asset (CoA) risk assessment framework. It is implemented on AWS CloudTrail data and demonstrated with an Airline Booking proof of concept. The evaluation includes a user study (n=39) showing that the IR dashboard reduces mean time-to-detect and increases accuracy (by up to 18%), and an expert-driven CoA ranking with Kendall’s W of about 0.72, indicating strong cross-annotator agreement. The approach yields CSP-agnostic, low-overhead observability enhancements and a scalable core for developing additional situational-awareness tools, with a public release of the log dataset and clear paths toward cross-CSP extension and proactive threat hunting.
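For readers unfamiliar with the agreement statistic quoted above, the following is a minimal sketch of Kendall's W for complete rankings without ties, using the standard formula W = 12S / (m^2 (n^3 - n)), where m is the number of raters, n the number of ranked items, and S the sum of squared deviations of the per-item rank totals from their mean. The function name and the sample rankings are illustrative assumptions; they are not taken from the paper's code or data.

```python
def kendalls_w(rankings):
    """Kendall's coefficient of concordance for complete, untied rankings.

    rankings: list of m lists, each a permutation of the ranks 1..n
              (one list per rater, one rank per item).
    """
    m = len(rankings)      # number of raters
    n = len(rankings[0])   # number of ranked items
    # Total rank each item received across all raters.
    totals = [sum(r[i] for r in rankings) for i in range(n)]
    # Mean of the rank totals is m * (n + 1) / 2 for complete rankings.
    mean_total = m * (n + 1) / 2
    # Sum of squared deviations of the totals from their mean.
    s = sum((t - mean_total) ** 2 for t in totals)
    return 12 * s / (m ** 2 * (n ** 3 - n))


# Hypothetical example: three raters ranking four assets by criticality.
example = [
    [1, 2, 3, 4],
    [1, 3, 2, 4],
    [2, 1, 3, 4],
]
w = kendalls_w(example)
```

W ranges from 0 (no agreement among raters) to 1 (identical rankings), so a value near 0.72 sits toward the high-agreement end of the scale.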
Abstract
In a fully managed serverless environment, the cloud service provider is responsible for securing the cloud infrastructure, thereby reducing the operational and maintenance efforts of application developers. However, this environment limits the use of existing cybersecurity frameworks and tools, which reduces observability and situational awareness capabilities (e.g., risk assessment, incident response). In addition, existing security frameworks for serverless applications do not generalize well to all application architectures and typically require adaptation and specialized expertise before they can be used in fully managed serverless environments. In this paper, we introduce a three-layer security scheme for applications deployed in fully managed serverless environments. The first two layers involve a unique ontology, based solely on serverless logs, that is used to transform those logs into a unified application activity knowledge graph. In the third layer, we address the need for observability and situational awareness capabilities by implementing two situational awareness tools that utilize the graph-based representation: 1) An incident response dashboard that leverages the ontology to visualize and examine application activity logs in the context of cybersecurity alerts. Our user study showed that the dashboard enabled participants to respond more accurately and quickly to new security alerts than the baseline tool. 2) A criticality of asset (CoA) risk assessment framework that enables efficient expert-based prioritization in cybersecurity contexts.
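To make the log-to-graph idea concrete, the sketch below turns a few CloudTrail-style records into a simple adjacency structure of (actor, action, target) relations. It is an illustration under stated assumptions, not the ontology or pipeline described in the paper: the field names (`userIdentity.arn`, `eventName`, `eventSource`, `eventTime`, `resources[].ARN`) follow the CloudTrail record format, while the mapping rule, the function names, and the sample records are hypothetical.

```python
from collections import defaultdict


def record_to_edge(record):
    """Map one CloudTrail-style record to a (source, action, target) triple.

    Assumed mapping (hypothetical): the calling principal is the source node,
    the API call is the edge label, and the first listed resource (or, failing
    that, the service endpoint) is the target node.
    """
    source = record.get("userIdentity", {}).get("arn", "unknown-principal")
    action = record.get("eventName", "UnknownEvent")
    service = record.get("eventSource", "unknown-service")
    resources = record.get("resources") or []
    target = resources[0].get("ARN", service) if resources else service
    return source, action, target


def build_activity_graph(records):
    """Group edges by source node: {source: [(action, target, time), ...]}."""
    graph = defaultdict(list)
    for record in records:
        source, action, target = record_to_edge(record)
        graph[source].append((action, target, record.get("eventTime")))
    return dict(graph)


# Made-up sample records, shaped like CloudTrail entries.
sample_records = [
    {
        "eventTime": "2023-01-01T10:00:00Z",
        "eventSource": "dynamodb.amazonaws.com",
        "eventName": "PutItem",
        "userIdentity": {"arn": "arn:aws:sts::111122223333:assumed-role/booking-fn/session"},
        "resources": [{"ARN": "arn:aws:dynamodb:us-east-1:111122223333:table/Bookings"}],
    },
    {
        "eventTime": "2023-01-01T10:00:05Z",
        "eventSource": "sns.amazonaws.com",
        "eventName": "Publish",
        "userIdentity": {"arn": "arn:aws:sts::111122223333:assumed-role/booking-fn/session"},
    },
]

graph = build_activity_graph(sample_records)
```

A graph of this shape lets an analyst start from an alerted resource or principal and walk its neighbors, which is the kind of contextual navigation that a graph-based incident response view relies on.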
