Designing DNNs for a trade-off between robustness and processing performance in embedded devices
Jon Gutiérrez-Zaballa, Koldo Basterretxea, Javier Echanobe
TL;DR
The paper addresses how to preserve DNN resilience to soft errors in safety-critical edge applications when compression techniques are applied. It compares bounded activation functions (AFs), namely Sigmoid and Hard Sigmoid, with ReLU in an encoder-decoder U-Net for hyperspectral semantic segmentation, and evaluates robustness via fault injection alongside deployment metrics. The study covers uncompressed, pruned, and pruned-quantized models, including FPGA-based deployment on a KV260 SoM, and reports IoU, throughput, and power data. The results indicate that bounded AFs improve robustness to single-bit upsets (SBUs), but aggressive pruning and quantization shift the optimal design; ReLU yields the best IoU and efficiency, while Hard Sigmoid offers a useful robustness-performance compromise.
Abstract
Machine learning-based embedded systems employed in safety-critical applications such as aerospace and autonomous driving need to be robust against perturbations produced by soft errors. Soft errors are an increasing concern in modern digital processors since smaller transistor geometries and lower voltages give electronic devices a higher sensitivity to background radiation. The resilience of deep neural network (DNN) models to perturbations in their parameters is determined, to a large extent, by the structure of the model itself, and also by the selected numerical representation and the arithmetic precision used. When compression techniques such as model pruning and model quantization are applied to reduce memory footprint and computational complexity for deployment, both model structure and numerical representation are modified and, thus, soft error robustness also changes. In this respect, although the choice of activation functions (AFs) in DNN models is frequently ignored, it conditions not only their accuracy and trainability, but also their compressibility rates and numerical robustness. This paper investigates the suitability of using bounded AFs to improve model robustness against DNN parameter perturbations, assessing at the same time the impact of this choice on deployment in terms of model accuracy, compressibility, and computational burden. In particular, we analyze encoder-decoder fully convolutional models aimed at performing semantic segmentation tasks on hyperspectral images for scene understanding in autonomous driving. Deployment characterization is performed experimentally on an AMD-Xilinx KV260 SoM.
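As a rough illustration of why a bounded AF can limit the effect of a parameter perturbation, the following sketch flips a single bit in a float32 weight and passes the resulting pre-activation through ReLU and through a Hard Sigmoid. It is not the fault-injection setup used in the paper: the float32 representation, the helper names (`flip_bit`, `relu`, `hard_sigmoid`), the clip(x/6 + 0.5, 0, 1) form of Hard Sigmoid, and the toy weight and input values are all assumptions made for this example.

```python
import struct


def flip_bit(value: float, bit: int) -> float:
    """Flip one bit of the IEEE 754 float32 encoding of `value`.

    Bit 0 is the least significant mantissa bit, bits 23-30 are the
    exponent, and bit 31 is the sign. (Hypothetical helper for this sketch.)
    """
    (as_int,) = struct.unpack("<I", struct.pack("<f", value))
    (flipped,) = struct.unpack("<f", struct.pack("<I", as_int ^ (1 << bit)))
    return flipped


def relu(x: float) -> float:
    # Unbounded above: a huge pre-activation passes through unchanged.
    return max(0.0, x)


def hard_sigmoid(x: float) -> float:
    # Assumed definition: clip(x / 6 + 0.5, 0, 1). Other libraries use
    # slightly different slopes; the output is bounded to [0, 1] either way.
    return min(1.0, max(0.0, x / 6.0 + 0.5))


weight = 0.5        # toy fault-free weight
activation_in = 1.0  # toy input to the neuron

# Flip the most significant exponent bit, one of the most damaging positions.
faulty_weight = flip_bit(weight, 30)

clean_pre = weight * activation_in
faulty_pre = faulty_weight * activation_in

print("clean weight :", weight, "faulty weight:", faulty_weight)
print("ReLU         :", relu(clean_pre), "->", relu(faulty_pre))
print("Hard Sigmoid :", hard_sigmoid(clean_pre), "->", hard_sigmoid(faulty_pre))
```

In this toy case the flipped weight becomes 2^127, so the ReLU output grows by dozens of orders of magnitude and can dominate every downstream layer, whereas the Hard Sigmoid output saturates at 1. The bounded AF does not remove the error, but it caps how far a single corrupted parameter can propagate.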
