eBeeMetrics: An eBPF-based Library Framework for Feedback-free Observability of QoS Metrics

Muntaka Ibnath, Mohammadreza Rezvani, Daniel Wong

Abstract

Many system management runtimes (SMRs), such as resource management and power management techniques, rely on quality-of-service (QoS) metrics, such as tail latency or throughput, as feedback. These QoS metrics are generally neither observable with hardware performance counters nor directly observable within the OS kernel. This introduces complexity and overhead in instrumenting the application and integrating QoS metric feedback with many management runtimes. To bridge this gap, we introduce eBeeMetrics, an eBPF-based library framework that accurately observes application-level metrics derived only from eBPF-observable events, such as system calls. eBeeMetrics can be used as a drop-in replacement to decouple system management runtimes from QoS metric feedback reporting, or it can supplement existing QoS metrics to better identify server-side dynamics. eBeeMetrics achieves a strong correlation with real-world measured throughput and latency metrics across various latency-sensitive workloads. The eBeeMetrics tool is open-source; the source code is available at: https://github.com/Ibnathism/eBeeMetrics.

Paper Structure

This paper contains 47 sections, 11 figures, 2 tables.

Figures (11)

  • Figure 1: (a) Prior work [Paper:kernelobservability] demonstrates observability of throughput-based proxy metrics through offline analysis of eBPF-provided traces. (b) eBeeMetrics enables a fully online solution for observability of both throughput and latency metrics.
  • Figure 2: Key syscalls from request processing using the HTTP/1.1 protocol. Syscalls from different worker threads can be interleaved, which obscures request boundaries because syscalls preserve only the process association (PID) and not the thread association (TID).
  • Figure 3: Detailed overview of eBeeMetrics.
  • Figure 4: Client-Server communication protocols of HTTP/1.1 and HTTP/2.
  • Figure 5: Example of raw eBPF syscall trace logs from the eBPF trace pipe. A key insight of eBeeMetrics is identifying metadata in eBPF-observable events that allows individual requests and their timing to be separated out.
  • ...and 6 more figures