Performance Prediction of On-NIC Network Functions with Multi-Resource Contention and Traffic Awareness

Shaofeng Wu, Qiang Su, Zhixiong Niu, Hong Xu

TL;DR

Yala tackles accurate performance prediction for on-NIC NFs under multi-resource contention and dynamic traffic. It introduces a divide-and-compose approach with per-resource contention models (hardware accelerators via a queueing-based white-box model and memory via gradient-boosted regression) and an execution-pattern-based composition, augmented by traffic-aware features and adaptive profiling. Empirical results on BlueField-2 show substantial improvements in prediction accuracy (average MAPE around $3.7\%$) and SLA adherence, enabling effective contention-aware NF placement and fast bottleneck diagnosis. The framework demonstrates strong potential for practical deployment and generalizes to other SoC SmartNICs, with open-source tooling and neural-guidance-free inference benefiting operators and researchers alike.

Abstract

Network function (NF) offloading on SmartNICs has been widely used in modern data centers, offering benefits in host resource saving and programmability. Co-running NFs on the same SmartNIC can cause performance interference due to contention for onboard resources. To meet performance SLAs while ensuring efficient resource management, operators need mechanisms to predict NF performance under such contention. However, existing solutions lack SmartNIC-specific knowledge and exhibit limited traffic awareness, leading to poor accuracy for on-NIC NFs. This paper proposes Yala, a novel performance prediction system for on-NIC NFs. Yala builds upon the key observation that co-located NFs contend for multiple resources, including onboard accelerators and the memory subsystem. It also facilitates traffic awareness according to the behaviors of individual resources to maintain accuracy as the external traffic attributes vary. Evaluation using BlueField-2 SmartNICs shows that Yala improves the prediction accuracy by 78.8% and reduces SLA violations by 92.2% compared to state-of-the-art approaches, and enables new practical use cases.

Paper Structure (40 sections, 4 equations, 8 figures, 11 tables, 1 algorithm)

Figures (8)

  • Figure 1: Throughput drop ratios of some NFs from Table \ref{tab:nfs} under resource contention when co-located with at most 3 other random NFs.
  • Figure 2: Prediction errors (absolute percentage error) of FlowMonitor's throughput using single-resource models. (a) Box-and-whisker plot of errors when using only the memory-based SLOMO model or a regex-based model (\ref{design:reg_model}). The median error is shown on top of each error box. (b) Mean absolute percentage error (MAPE) of sum and min composition of single-resource models. NF1 and NF2 adopt the run-to-completion and pipeline resource usage patterns, respectively.
  • Figure 3: (a) FlowStats's throughput when the competitor's cache access rate (CAR) changes in three distinct traffic profiles. CAR is the sum of the cache read and write rates obtained from the hardware performance counters on BlueField-2. (b) Distribution of prediction errors after adapting the model to different traffic profiles. We show the median error on top of each error box.
  • Figure 4: Throughput of co-running synthetic pattern-matching "regex-NF" and regex-bench as a function of the arrival rate of regex-bench. In each setting, regex-NF and regex-bench reach an equilibrium throughput, e.g., with an MTBR of 194 matches/MB for regex-NF, they both obtain 48.9 Mpps at equilibrium.
  • Figure 5: Throughput of two synthetic Click NFs that use pipeline (top) and run-to-completion (bottom) as a function of competing CAR in memory subsystem and match rate in regex accelerator.
  • ...and 3 more figures
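
The captions above refer to MAPE and to "sum" and "min" composition of single-resource models. The following is a minimal Python sketch of one plausible reading of those terms, not the paper's implementation: it assumes that sum composition adds the throughput drops predicted by each single-resource model, and that min composition takes the lowest per-resource throughput prediction. The function names and the sample numbers are hypothetical.

```python
def mape(actual, predicted):
    """Mean absolute percentage error, in percent."""
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    errors = [abs(a - p) / abs(a) for a, p in zip(actual, predicted)]
    return 100.0 * sum(errors) / len(errors)


def compose_sum(solo_throughput, per_resource_predictions):
    """Assumed 'sum' composition: add up the drop predicted for each resource.

    Each entry of per_resource_predictions is the throughput predicted by a
    model that considers contention on only one resource.
    """
    total_drop = sum(solo_throughput - t for t in per_resource_predictions)
    return max(solo_throughput - total_drop, 0.0)


def compose_min(per_resource_predictions):
    """Assumed 'min' composition: the most contended resource is the bottleneck."""
    return min(per_resource_predictions)


# Hypothetical numbers (Mpps): solo throughput 10.0, memory-only model
# predicts 8.0, accelerator-only model predicts 7.0.
solo = 10.0
predictions = [8.0, 7.0]
sum_estimate = compose_sum(solo, predictions)
min_estimate = compose_min(predictions)
```

Under these assumptions, the sum estimate fits an NF whose per-packet work on each resource is serialized, while the min estimate fits an NF whose stages overlap so that only the slowest stage limits throughput. Which of the two matches a given NF is the question Figure 2(b) examines for NF1 and NF2.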