Constrained Adversarial Learning for Automated Software Testing: a literature review
João Vitorino, Tiago Dias, Tiago Fonseca, Eva Maia, Isabel Praça
TL;DR
This literature review addresses how constrained adversarial data generation can improve automated software testing. It surveys the state of constrained data generation in adversarial ML (RQ1) and its applicability to testing (RQ2), using a PRISMA-guided methodology over 2017–2023. Key findings include 98 constrained-generation applications across 11 sectors and 17 software-testing publications, with white-box methods and GAN-based approaches showing notable promise for increasing test coverage, while black-box/grey-box contexts remain challenging. The work highlights transfer opportunities, identifies gaps (especially for non-white-box settings), and proposes future directions such as NLP-driven constraint extraction and hybrid methods to enhance test data quality and resilience of information systems.
Abstract
It is imperative to safeguard computer applications and information systems against the growing number of cyber-attacks. Automated software testing tools can be developed to quickly analyze many lines of code and detect vulnerabilities by generating function-specific testing data. This process resembles the generation of constrained adversarial examples by adversarial machine learning methods, so integrating these methods into testing tools could bring significant benefits for identifying possible attack vectors. Therefore, this literature review focuses on the current state of the art of constrained data generation approaches applied to adversarial learning and software testing, aiming to guide researchers and developers in enhancing their software testing tools with adversarial testing methods and improving the resilience and robustness of their information systems. The approaches found were systematized, and the advantages and limitations of those specific to white-box, grey-box, and black-box testing were analyzed, identifying research gaps and opportunities to automate testing tools with data generated by adversarial attacks.
