Overload: Latency Attacks on Object Detection for Edge Devices

Erh-Chung Chen, Pin-Yu Chen, I-Hsin Chung, Che-rung Lee

TL;DR

The paper addresses the risk that latency, not misclassification, can cripple real-time edge deployments of object detectors. It introduces Overload, a latency-attack framework that intentionally generates ghost objects to inflate NMS workload, guided by a simple objective that prioritizes maximizing object confidence and uses spatial attention to concentrate perturbations in sparse regions. Empirical results on Nvidia Jetson NX with YOLOv5 demonstrate about a tenfold increase in per-image inference time, with tens of thousands of ghost boxes produced in many cases, and the attack proving largely NMS-agnostic. The findings reveal a practical denial-of-service threat for edge devices and motivate defenses such as limiting object counts or enforcing timeouts, while also suggesting avenues for robust, black-box attack resistance in future work.

Abstract

The deployment of deep learning-based applications has become an essential task owing to the increasing demand for intelligent services. In this paper, we investigate latency attacks on deep learning applications. Unlike common adversarial attacks, which aim at misclassification, latency attacks aim to increase the inference time, which may stop applications from responding to requests within a reasonable time. This kind of attack applies to a wide range of applications, and we use object detection to demonstrate how it works. We also design a framework named Overload to generate latency attacks at scale. Our method is based on a newly formulated optimization problem and a novel technique called spatial attention. The attack escalates the computing cost required at inference time, which in turn extends the inference time for object detection. It presents a significant threat, especially to systems with limited computing resources. We conducted experiments using YOLOv5 models on Nvidia Jetson NX. Compared to existing methods, our method is simpler and more effective. The experimental results show that under latency attacks, the inference time of a single image can be increased tenfold relative to the normal setting. Moreover, our findings pose a potential new threat to all object detection tasks requiring non-maximum suppression (NMS), as our attack is NMS-agnostic.
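To make the mechanism concrete, the sketch below shows why a flood of ghost objects inflates NMS work, and how capping the number of candidates (one of the defenses mentioned in the TL;DR) bounds it. This is a minimal pure-Python illustration, not the paper's implementation: `greedy_nms`, `ghost_boxes`, and the `max_candidates` parameter are hypothetical names, and the non-overlapping grid of boxes is an assumed worst case. Production detectors use vectorized NMS, so absolute costs differ, but in this simple greedy form the pairwise comparisons grow quadratically with the number of surviving boxes.

```python
from typing import List, Optional, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def greedy_nms(
    boxes: List[Box],
    scores: List[float],
    iou_thr: float = 0.5,
    max_candidates: Optional[int] = None,  # hypothetical defense knob
) -> Tuple[List[int], int]:
    """Greedy NMS. Returns (kept indices, number of IoU evaluations)."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    if max_candidates is not None:
        # Defense sketch: only the top-scoring candidates enter NMS.
        order = order[:max_candidates]
    keep: List[int] = []
    comparisons = 0
    for i in order:
        suppressed = False
        for j in keep:
            comparisons += 1
            if iou(boxes[i], boxes[j]) > iou_thr:
                suppressed = True
                break
        if not suppressed:
            keep.append(i)
    return keep, comparisons


def ghost_boxes(n: int, size: float = 4.0, stride: float = 10.0,
                per_row: int = 100) -> List[Box]:
    """Assumed worst case: n small boxes on a grid that never overlap."""
    boxes = []
    for k in range(n):
        x = (k % per_row) * stride
        y = (k // per_row) * stride
        boxes.append((x, y, x + size, y + size))
    return boxes


if __name__ == "__main__":
    for n in (100, 1000):
        boxes = ghost_boxes(n)
        scores = [1.0 - 1e-4 * k for k in range(n)]
        _, work = greedy_nms(boxes, scores)
        _, capped = greedy_nms(boxes, scores, max_candidates=100)
        print(f"n={n}: comparisons={work}, with cap of 100: {capped}")
```

Because none of the ghost boxes overlap, nothing is suppressed and every box is compared against every box already kept, giving n(n-1)/2 IoU evaluations. The cap trades recall in crowded scenes for a fixed upper bound on NMS work, which is why it is a blunt defense rather than a complete one.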

Paper Structure

This paper contains 24 sections, 8 equations, 7 figures, 11 tables, and 1 algorithm.

Figures (7)

  • Figure 1: The processing flow of object detection. NMS stands for non-maximum suppression.
  • Figure 2: Elapsed time of NMS on NVIDIA Jetson NX.
  • Figure 3: The execution flow of spatial attention.
  • Figure 4: RetinaNet outputs on adversarial examples. The two subfigures are generated by the normal PGD attack and the Overload attack, respectively.
  • Figure 5: FCOS outputs on adversarial examples. The two subfigures are generated by the normal PGD attack and the Overload attack, respectively.
  • ...and 2 more figures