FetaFix: Automatic Fault Localization and Repair of Deep Learning Model Conversions

Nikolaos Louloudakis, Perry Gibson, José Cano, Ajitha Rajan

TL;DR

FetaFix targets bugs that arise when DL models are converted between frameworks. It localizes faults in the model input, parameters, and computation graph using ONNX representations and TVM-based activations, then repairs the offending components with a staged, greedy process. Evaluated on three image-classification models converted across four frameworks, it repaired $462$ of $755$ detected faults and completely repaired $12$ of $15$ erroneous conversions. The approach covers six fault categories: preprocessing, input, tensor shape, weights/biases, hyperparameters, and graph structure. Early input-based fixes and weights/biases repairs account for most of the improvement. The work supports more reliable cross-framework deployment and gives converter tool developers actionable feedback.

Abstract

Converting deep learning models between frameworks is a common step to maximize model compatibility across devices and leverage optimization features that may be exclusively provided in one deep learning framework. However, this conversion process may be riddled with bugs, making the converted models either undeployable or problematic, considerably degrading their prediction correctness. In this paper, we propose an automated approach for fault localization and repair, FetaFix, during model conversion between deep learning frameworks. FetaFix is capable of detecting and fixing faults introduced in model input, parameters, hyperparameters, and the model graph during conversion. FetaFix uses a set of fault types (mined from surveying common conversion issues reported in code repositories and forums) to localize potential conversion faults in the converted target model and then repair them appropriately, e.g., replacing the parameters of the target model with those from the source model. This is done iteratively for every image in the dataset, comparing output label differences between the source model and the converted target model until all differences are resolved. We evaluate the effectiveness of FetaFix in fixing model conversion bugs of three widely used image recognition models converted across four different deep learning frameworks. Overall, FetaFix was able to fix $462$ out of $755$ detected conversion faults, either completely repairing or significantly improving the performance of $14$ out of the $15$ erroneous conversion cases.
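The comparison-and-repair loop described in the abstract can be illustrated with a minimal sketch. It is not the FetaFix implementation: the paper works on ONNX models with TVM-based activations, whereas this toy uses small NumPy multilayer perceptrons, and the helper names (`predict_labels`, `label_dissimilarity`, `greedy_parameter_repair`) are hypothetical. It shows only the general idea of measuring output-label differences between a source and a target model and greedily replacing mismatched target parameters with the source's.

```python
import numpy as np


def predict_labels(layers, x):
    """Run a toy MLP given as a list of (weights, bias) pairs; return top-1 labels."""
    h = x
    for i, (w, b) in enumerate(layers):
        h = h @ w + b
        if i < len(layers) - 1:  # ReLU on every layer except the last
            h = np.maximum(h, 0.0)
    return h.argmax(axis=1)


def label_dissimilarity(source_labels, target_labels):
    """Percentage of inputs whose top-1 label differs between the two models."""
    return 100.0 * float(np.mean(source_labels != target_labels))


def greedy_parameter_repair(source, target, x):
    """Assumed repair strategy: for each layer whose parameters differ,
    try the source parameters and keep the change only if the label
    dissimilarity does not get worse."""
    src_labels = predict_labels(source, x)
    repaired = [(w.copy(), b.copy()) for w, b in target]
    best = label_dissimilarity(src_labels, predict_labels(repaired, x))
    for i, ((sw, sb), (tw, tb)) in enumerate(zip(source, target)):
        if best == 0.0:  # all label differences resolved
            break
        if np.array_equal(sw, tw) and np.array_equal(sb, tb):
            continue  # this layer was converted correctly
        candidate = list(repaired)
        candidate[i] = (sw.copy(), sb.copy())
        score = label_dissimilarity(src_labels, predict_labels(candidate, x))
        if score <= best:
            repaired, best = candidate, score
    return repaired, best


# Toy data: a "source" model and a "target" whose first layer was corrupted,
# standing in for a faulty conversion.
rng = np.random.default_rng(0)
source = [
    (rng.normal(size=(8, 16)), rng.normal(size=16)),
    (rng.normal(size=(16, 4)), rng.normal(size=4)),
]
target = [(w.copy(), b.copy()) for w, b in source]
target[0] = (target[0][0] + rng.normal(scale=2.0, size=(8, 16)), target[0][1])

x = rng.normal(size=(200, 8))
before = label_dissimilarity(predict_labels(source, x), predict_labels(target, x))
repaired, best = greedy_parameter_repair(source, target, x)
print(f"dissimilarity before: {before:.1f}%, after: {best:.1f}%")
```

Accepting a replacement only when the dissimilarity does not increase keeps the loop greedy and monotone. In this toy case a single corrupted layer is restored exactly, so the repaired model agrees with the source on every input.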

Paper Structure

This paper contains 35 sections, 4 figures, 3 tables, and 1 algorithm.

Figures (4)

  • Figure 1: Pairwise comparison of output labels between Source and converted Target models (from louloudakis2023deltann).
  • Figure 2: Fault localization and repair pipeline of FetaFix, handling six fault types: three in the Input-based category and three in the Layer-based category.
  • Figure 3: Indicative example of differences in layer weights and hyperparameters, introduced in the model conversion process.
  • Figure 4: Model conversion cases that required an iterative weights and biases repair strategy, with the percentage of output label dissimilarity shown after each repair cycle (following preprocessing repair).