GIST: Generated Inputs Sets Transferability in Deep Learning
Florian Tambon, Foutse Khomh, Giuliano Antoniol
TL;DR
GIST (Generated Inputs Sets Transferability) reduces the cost of generating test sets for multiple DNNs by transferring suitable test sets from benchmark models. It formalizes the transfer problem around a target transferable property ${\mathcal{P}}_O$ and a proxy ${\mathcal{P}}$, then validates the approach on two properties, neuron coverage and fault-type coverage, across image and text modalities using offline correlation analysis and online test-set transfer. The study shows that appropriate similarity metrics (e.g., PWCCA, J$_{Div}$) correlate with the transferable properties, enabling test-set transfer via two heuristics (Overall Best First and Each Best First) that outperform random selection in coverage while offering favorable trade-offs in execution time. The offline calibration is costly, but its cost is amortized over multiple transfers, making GIST practical for scalable DNN testing. The replication package and results suggest broad applicability to other properties and modalities, with potential enhancements such as seed optimization and tailored similarity measures.
Abstract
To foster the verifiability and testability of Deep Neural Networks (DNNs), an increasing number of test case generation techniques are being developed. When testing DNN models, a user can apply any existing test generation technique. However, they must do so for each technique and each DNN model under test, which can be expensive. Therefore, a paradigm shift could benefit this testing process: rather than regenerating the test set independently for each DNN model under test, we could transfer test sets from existing DNN models. This paper introduces GIST (Generated Inputs Sets Transferability), a novel approach for the efficient transfer of test sets. Given a property selected by a user (e.g., neurons covered, faults), GIST enables the selection of good test sets, from the point of view of this property, among the available test sets. This allows the user to recover properties on the transferred test sets similar to those they would have obtained by generating the test set from scratch with a test case generation technique. Experimental results show that GIST can select effective test sets to transfer for the given property. Moreover, GIST scales better than reapplying test case generation techniques from scratch on each DNN model under test.
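The general idea of proxy-based selection can be illustrated with a minimal Python sketch. It is not the paper's implementation, and it does not reproduce the Overall Best First or Each Best First heuristics. It assumes that a similarity score between the model under test and each reference model has already been computed (for instance with a representational similarity metric), and that property values measured on past transfers are available for the offline check. The names `pearson`, `rank_reference_models`, and `select_test_set` are hypothetical, introduced only for this example.

```python
from typing import Dict, List, Sequence, Tuple


def pearson(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation between two equally sized samples.

    Offline step (assumed): check that the similarity proxy tracks the
    property of interest before relying on it for selection.
    """
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two samples of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return cov / (var_x * var_y) ** 0.5


def rank_reference_models(similarity: Dict[str, float]) -> List[Tuple[str, float]]:
    """Sort reference models from most to least similar to the model under test."""
    return sorted(similarity.items(), key=lambda item: item[1], reverse=True)


def select_test_set(
    similarity: Dict[str, float],
    test_sets: Dict[str, list],
) -> Tuple[str, list]:
    """Online step (assumed): reuse the test set of the most similar reference model."""
    best_model, _ = rank_reference_models(similarity)[0]
    return best_model, test_sets[best_model]


# Hypothetical data: similarity of three reference models to the model under test.
similarity = {"ref_a": 0.2, "ref_b": 0.9, "ref_c": 0.5}
test_sets = {"ref_a": ["x1"], "ref_b": ["x2", "x3"], "ref_c": ["x4"]}
chosen_model, chosen_set = select_test_set(similarity, test_sets)
```

In this sketch, a high correlation from `pearson` on historical (similarity, property) pairs would be the signal that the proxy is usable; the selection itself only needs the similarity scores, which is what makes transfer cheaper than regenerating a test set.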
