ByteShield: Adversarially Robust End-to-End Malware Detection through Byte Masking
Daniel Gibert, Felip Manyà
TL;DR
ByteShield introduces a deterministic byte-masking defense for end-to-end malware detectors: it slides a mask across the input, generates multiple masked versions, and classifies with a threshold-based voting rule. Because the adversarial payload is occluded, fully or partially, in at least one masked version, the defense achieves strong robustness against functionality-preserving attacks while maintaining high accuracy on clean data and offering faster inference than smoothing-based defenses. On the EMBER and BODMAS benchmarks, ByteShield outperforms randomized and (de)randomized smoothing defenses, is resilient to diverse payload-injection attacks, and shows temporal robustness at manageable computational cost. The approach hinges on training the model on masked inputs and on a conservative voting strategy that limits false positives and keeps malware detection reliable under adversarial pressure.
Abstract
Prior research has shown that end-to-end malware detectors are vulnerable to adversarial attacks. In response, the research community has proposed defenses based on randomized and (de)randomized smoothing. However, these techniques remain susceptible to attacks that insert large adversarial payloads. To address these limitations, we propose a novel defense mechanism that hardens end-to-end malware detectors by masking at the byte level. The mechanism generates multiple masked versions of the input file, classifies each version independently, and then applies a threshold-based voting mechanism to produce the final classification. Key to this defense is a deterministic masking strategy that systematically strides a mask across the entire input file. Unlike randomized smoothing defenses, which randomly mask or delete bytes, this structured approach ensures that the successive versions together cover the whole file. In the best-case scenario, the mask fully occludes the adversarial payload, effectively neutralizing its influence on the model's decision. In the worst-case scenario, it partially occludes the adversarial payload, reducing its impact on the model's predictions. By occluding the adversarial payload in one or more masked versions, the defense ensures that some input versions remain representative of the file's original intent, allowing the voting mechanism to suppress the influence of the adversarial payload. Results on the EMBER and BODMAS datasets show that our defense outperforms randomized and (de)randomized smoothing defenses against adversarial examples generated with a wide range of functionality-preserving manipulations while maintaining high accuracy on clean examples.
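The masking-and-voting pipeline described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the mask token value (`MASK_TOKEN`), the parameters `window`, `stride`, `score_threshold`, and `vote_threshold`, the exact form of the voting rule, and the toy detector `toy_score` are all assumptions introduced here for illustration only.

```python
from typing import Callable, List

# Assumption: a reserved token outside the byte range 0..255 marks masked positions.
MASK_TOKEN = 256


def masked_versions(data: bytes, window: int, stride: int) -> List[List[int]]:
    """Slide a mask of `window` bytes across `data` in steps of `stride`."""
    if not 0 < stride <= window:
        # stride <= window guarantees every byte is masked in at least one version.
        raise ValueError("require 0 < stride <= window")
    values = list(data)
    n = len(values)
    versions = []
    start = 0
    while True:
        end = min(start + window, n)
        version = values.copy()
        version[start:end] = [MASK_TOKEN] * (end - start)
        versions.append(version)
        if end >= n:  # the mask has reached the end of the file
            break
        start += stride
    return versions


def classify(
    data: bytes,
    score_fn: Callable[[List[int]], float],
    window: int,
    stride: int,
    vote_threshold: float,
    score_threshold: float = 0.5,
) -> bool:
    """Return True (malware) if the fraction of masked versions scored as
    malicious reaches `vote_threshold`. This voting rule is an assumed form."""
    versions = masked_versions(data, window, stride)
    malicious_votes = sum(score_fn(v) >= score_threshold for v in versions)
    return malicious_votes / len(versions) >= vote_threshold


def toy_score(version: List[int]) -> float:
    """Hypothetical detector: 0xCC bytes look malicious, but any visible
    0xAA byte (a stand-in for an adversarial payload) forces a benign score."""
    if 0xAA in version:
        return 0.0
    return 1.0 if 0xCC in version else 0.0


# Malicious content, padding, then an appended adversarial payload.
sample = bytes([0xCC] * 4 + [0x00] * 4 + [0xAA] * 4)
undefended = toy_score(list(sample)) >= 0.5
defended = classify(sample, toy_score, window=4, stride=4, vote_threshold=0.3)
```

In this toy setting the unmasked file evades the detector, while the version whose mask covers the payload still votes malicious, so the thresholded vote flags the file. A real deployment would replace `toy_score` with a model trained on masked inputs and would tune the window size, stride, and thresholds empirically.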
