GIST: Generated Inputs Sets Transferability in Deep Learning

Florian Tambon; Foutse Khomh; Giuliano Antoniol

TL;DR

GIST (Generated Inputs Sets Transferability) reduces the cost of generating test sets for multiple DNNs by transferring suitable test sets from benchmark models instead of regenerating them for each model. It formalizes the transfer problem around a target transferable property ${\mathcal{P}}_O$ and a proxy ${\mathcal{P}}$, then validates the approach on two properties, neuron coverage and fault-type coverage, across image and text modalities, using an offline correlation analysis followed by an online test-set transfer. The study shows that appropriate similarity metrics (e.g., PWCCA, $J_{Div}$) correlate with the transferable properties. This correlation enables effective test-set transfer via two heuristics (Overall Best First and Each Best First), which outperform random selection in coverage while offering favorable trade-offs in execution time. The offline calibration is costly, but its cost is amortized over multiple transfers, making GIST practical for scalable DNN testing. The results and replication package suggest that the approach could apply to other properties and modalities, with potential enhancements such as seed optimization and tailored similarity measures.
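
The offline step described above, checking which similarity metric tracks the target property, can be illustrated with a minimal sketch. It assumes a rank correlation (Kendall's tau, chosen here for illustration; this summary does not state which statistic the paper uses), and all names and numbers (`metric_a`, `metric_b`, `pick_proxy`) are hypothetical.

```python
from itertools import combinations


def kendall_tau(xs, ys):
    """Kendall tau-a: (concordant pairs - discordant pairs) / total pairs."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences of length >= 2")
    score = 0
    for i, j in combinations(range(len(xs)), 2):
        prod = (xs[i] - xs[j]) * (ys[i] - ys[j])
        score += (prod > 0) - (prod < 0)  # +1 concordant, -1 discordant, 0 tie
    n_pairs = len(xs) * (len(xs) - 1) // 2
    return score / n_pairs


def pick_proxy(similarities, property_values):
    """Return (name, tau) of the candidate metric most correlated with the property."""
    taus = {name: kendall_tau(vals, property_values)
            for name, vals in similarities.items()}
    best = max(taus, key=taus.get)
    return best, taus[best]


# Illustrative numbers only (not results from the paper): one entry per
# (reference model, model under test) pair.
property_values = [0.90, 0.75, 0.60, 0.40]   # hypothetical relative P_O values
similarities = {
    "metric_a": [0.95, 0.80, 0.70, 0.30],    # hypothetical candidate proxy
    "metric_b": [0.20, 0.90, 0.10, 0.50],    # hypothetical candidate proxy
}
best_name, best_tau = pick_proxy(similarities, property_values)
print(best_name, best_tau)
```

In this toy data, `metric_a` orders the pairs exactly as the property does, so it would be retained as the proxy for the online step.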

Abstract

To foster the verifiability and testability of Deep Neural Networks (DNN), an increasing number of test case generation techniques are being developed. When confronted with testing DNN models, users can apply any existing test generation technique. However, they need to do so for each technique and each DNN model under test, which can be expensive. Therefore, a paradigm shift could benefit this testing process: rather than regenerating the test set independently for each DNN model under test, we could transfer test sets from existing DNN models. This paper introduces GIST (Generated Inputs Sets Transferability), a novel approach for the efficient transfer of test sets. Given a property selected by a user (e.g., neurons covered, faults), GIST selects, among the available test sets, those that are good with respect to this property. The transferred test sets thus let the user recover properties similar to those they would have obtained by generating the test set from scratch with a test case generation technique. Experimental results show that GIST can select effective test sets to transfer for the given property. Moreover, GIST scales better than reapplying test case generation techniques from scratch on DNN models under test.
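
The selection idea in the abstract can be sketched as a simple ranking: score each available test set by the proxy similarity between its source model and the model under test, then keep the top candidates. This is a generic top-k sketch under that assumption, not the paper's Overall Best First or Each Best First heuristics, which this summary does not define; the function name, model names, and scores are hypothetical.

```python
def select_test_sets(similarity_to_target, k=1):
    """Return the k reference test sets whose source model scores highest on the proxy.

    similarity_to_target maps a (hypothetical) reference model name to its
    proxy similarity with the DNN model under test.
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    ranked = sorted(similarity_to_target.items(),
                    key=lambda item: item[1], reverse=True)
    return [name for name, _ in ranked[:k]]


# Illustrative scores only (not taken from the paper).
scores = {"ref_model_1": 0.62, "ref_model_2": 0.91, "ref_model_3": 0.77}
chosen = select_test_sets(scores, k=2)
print(chosen)
```

Computing a similarity score per reference model is typically much cheaper than rerunning a generation technique on the model under test, which is the trade-off the abstract relies on.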

Paper Structure (31 sections, 9 equations, 9 figures, 9 tables, 2 algorithms)

This paper contains 31 sections, 9 equations, 9 figures, 9 tables, 2 algorithms.

Introduction
Methodology
Test set transferability problem
GIST: Generated Inputs Sets Transferability in Deep Learning
Property $\mathcal{P}_O$
Neuron Coverage based
Fault-Types Coverage based
Proxy $\mathcal{P}$
Projection Weighted Canonical Correlation Analysis (PWCCA)
Centered Kernel Alignment (CKA)
Procrustes Orthogonal (Ortho)
Performance (Acc)
Disagreement (Dis)
Divergence (Div)
Experimental Design
...and 16 more sections

Figures (9)

Figure 1: Transferring vs generating test cases. (Scenario A) When generating the test cases, we apply the test case generation technique to the DNN model under test, which can be time-consuming and must be repeated for every new DNN model under test and generation technique. (Scenario B) By transferring instead, we can reuse available test sets to obtain transferred test sets. Those transferred test sets should match what we would have obtained in terms of a desired property, for instance, the types of faults of the DNN model under test. Each black circle in the property space represents a cluster of fault types for the DNN model under test. Data points circled in dashed lines outside of those clusters are not faults for the DNN model under test.
Figure 2: General idea of GIST for test sets transferability. (Left) Offline computation of GIST to determine the correct proxy $\mathcal{P}$. (Right) Online mode to apply on a new DNN model under test using the proxy.
Figure 3: Relative $\mathcal{P}_O$ for the $T_O$ of each DNN model under test when applying different $T_R$ from reference DNN models. The darker the red, the higher the relative $\mathcal{P}_O$.
Figure 4: Average ranking of closest reference DNN model types in terms of $\mathcal{P}_O$ and similarity on image dataset according to a given DNN model under test. Colours indicate ranking (green = 1$^{st}$, red = 4$^{th}$).
Figure 5: Average ranking of closest reference DNN model types in terms of $\mathcal{P}_O$ and similarity on text dataset according to a given DNN model under test. Colours indicate ranking (green = 1$^{st}$, red = 4$^{th}$).
...and 4 more figures
