Python Fuzzing for Trustworthy Machine Learning Frameworks

Ilya Yegorov, Eli Kobrin, Darya Parygina, Alexey Vishnyakov, Andrey Fedotov

TL;DR

The paper addresses security and reliability gaps in popular ML frameworks and related projects (PyTorch, TensorFlow, h5py) by fuzzing their Python code and bindings. It proposes a dynamic analysis pipeline for Python built on Sydr-Fuzz that combines fuzzing, corpus minimization, coverage collection, and crash triaging, and integrates it into GitLab CI. Applying the pipeline to fuzz targets for PyTorch, TensorFlow, and h5py uncovered three new bugs; crashes were triaged and prioritized with Casr, and fixes were proposed. The work shows that fuzzing Python APIs and bindings selected by attack-surface analysis is a practical way to harden real-world ML frameworks.

Abstract

Ensuring the security and reliability of machine learning frameworks is crucial for building trustworthy AI-based systems. Fuzzing, a popular technique in the secure software development lifecycle (SSDLC), helps developers build secure and robust software. Popular machine learning frameworks such as PyTorch and TensorFlow are complex and written in multiple programming languages, including C/C++ and Python. We propose a dynamic analysis pipeline for Python projects using the Sydr-Fuzz toolset. Our pipeline includes fuzzing, corpus minimization, crash triaging, and coverage collection. Crash triaging and severity estimation are important steps to ensure that the most critical vulnerabilities are addressed promptly. Furthermore, the proposed pipeline is integrated into GitLab CI. To identify the most vulnerable parts of the machine learning frameworks, we analyze their potential attack surfaces and develop fuzz targets for PyTorch, TensorFlow, and related projects such as h5py. Applying our dynamic analysis pipeline to these targets, we discovered 3 new bugs and proposed fixes for them.

Paper Structure (19 sections, 1 table)