Opacus: User-Friendly Differential Privacy Library in PyTorch
Ashkan Yousefpour, Igor Shilov, Alexandre Sablayrolles, Davide Testuggine, Karthik Prasad, Mani Malek, John Nguyen, Sayan Ghosh, Akash Bharadwaj, Jessica Zhao, Graham Cormode, Ilya Mironov
TL;DR
Opacus is a user-friendly PyTorch library for training models with differential privacy, built on vectorized per-sample gradient computation and built-in privacy accounting. Its PrivacyEngine and GradSampleModule abstractions enable DP-SGD with minimal code changes, and benchmarks across diverse models and datasets show runtime and memory performance competitive with existing DP frameworks. The work demonstrates the advantage of vectorization over micro-batching and describes practical integration with the PyTorch ecosystem, making Opacus a scalable tool for privacy-preserving deep learning. Opacus is open-source and actively maintained, with plans for further flexibility and ecosystem integration.
Abstract
We introduce Opacus, a free, open-source PyTorch library for training deep learning models with differential privacy (hosted at opacus.ai). Opacus is designed for simplicity, flexibility, and speed. It provides a simple and user-friendly API, and enables machine learning practitioners to make a training pipeline private by adding as little as two lines to their code. It supports a wide variety of layers, including multi-head attention, convolution, LSTM, GRU (and generic RNN), and embedding, right out of the box and provides the means for supporting other user-defined layers. Opacus computes batched per-sample gradients, providing higher efficiency compared to the traditional "micro batch" approach. In this paper we present Opacus, detail the principles that drove its implementation and unique features, and benchmark it against other frameworks for training models with differential privacy as well as standard PyTorch.
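To make concrete why per-sample gradients matter for differentially private training, the following is a minimal, dependency-free sketch of a single DP-SGD update on a toy linear model. It is an illustration only and not Opacus's implementation or API: the function names (clip_to_norm, per_sample_grads, dp_sgd_step), the squared-error loss, and the toy data are all assumptions made for this example. The sketch clips each example's gradient to a maximum L2 norm, sums the clipped gradients, adds Gaussian noise scaled by the clipping bound, and averages.

```python
import math
import random


def clip_to_norm(grad, max_norm):
    """Scale grad down so its L2 norm is at most max_norm."""
    norm = math.sqrt(sum(g * g for g in grad))
    scale = min(1.0, max_norm / (norm + 1e-12))
    return [g * scale for g in grad]


def per_sample_grads(weights, xs, ys):
    """Gradient of 0.5 * (w.x - y)^2 with respect to w, one vector per example."""
    grads = []
    for x, y in zip(xs, ys):
        residual = sum(w * xi for w, xi in zip(weights, x)) - y
        grads.append([residual * xi for xi in x])
    return grads


def dp_sgd_step(weights, xs, ys, lr, max_grad_norm, noise_multiplier, rng):
    """One DP-SGD update: clip per example, sum, add Gaussian noise, average."""
    grads = per_sample_grads(weights, xs, ys)
    clipped = [clip_to_norm(g, max_grad_norm) for g in grads]
    summed = [sum(column) for column in zip(*clipped)]
    # Noise standard deviation is proportional to the clipping bound.
    sigma = noise_multiplier * max_grad_norm
    noisy = [s + rng.gauss(0.0, sigma) for s in summed]
    return [w - lr * n / len(xs) for w, n in zip(weights, noisy)]


# Hypothetical toy data: three examples with two features each.
xs = [[1.0, 0.0], [0.0, 2.0], [3.0, 4.0]]
ys = [1.0, 0.0, 10.0]
rng = random.Random(0)
new_weights = dp_sgd_step(
    [0.0, 0.0], xs, ys,
    lr=0.1, max_grad_norm=1.0, noise_multiplier=1.0, rng=rng,
)
```

The loop in per_sample_grads handles one example at a time, which is in spirit the "micro batch" approach. A vectorized implementation would instead compute all per-example gradients in one batched pass, which is the efficiency gain the abstract describes.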
