Disposable-key-based image encryption for collaborative learning of Vision Transformer

Rei Aso; Sayaka Shiota; Hitoshi Kiya

TL;DR

This work tackles privacy-preserving collaborative learning for Vision Transformer (ViT) by using learnable encryption to encrypt each training image with its own key. Training proceeds on encrypted data that is transmitted only once to a central server, which reduces communication and client-side computation compared to traditional federated approaches. The method employs block scrambling and pixel permutation via random permutation matrices, and introduces restricted random permutation matrices to mitigate accuracy loss while preserving privacy. Evaluations on CIFAR-10 demonstrate that ViT can be fine-tuned with encrypted data, and that restricting the permutation matrices improves the trade-off between accuracy and security, which matters in practice for privacy-conscious multi-client learning.
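The two encryption steps named above, block scrambling and pixel permutation, can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function name `encrypt_image`, the block size, and the choice to apply one pixel permutation to every block of an image are assumptions made for illustration, and the restricted permutation matrices are not covered. Indexing with a permutation of indices is equivalent to multiplying by the corresponding permutation matrix.

```python
import numpy as np

def encrypt_image(img, block, rng):
    """Scramble block positions and permute the pixels inside each block.

    img:   array of shape (H, W, C), with H and W divisible by `block`
    block: block side length (assumed here; e.g. the ViT patch size)
    rng:   np.random.Generator acting as the per-image key source
    """
    h, w, c = img.shape
    assert h % block == 0 and w % block == 0
    gh, gw = h // block, w // block

    # Split the image into gh*gw flattened blocks of block*block*c values.
    blocks = (img.reshape(gh, block, gw, block, c)
                 .transpose(0, 2, 1, 3, 4)
                 .reshape(gh * gw, block * block * c))

    # Draw the two secret permutations (the "key") for this image.
    block_perm = rng.permutation(gh * gw)            # block scrambling
    pixel_perm = rng.permutation(block * block * c)  # in-block pixel permutation

    blocks = blocks[block_perm][:, pixel_perm]

    # Reassemble the permuted blocks into an image of the original shape.
    return (blocks.reshape(gh, gw, block, block, c)
                  .transpose(0, 2, 1, 3, 4)
                  .reshape(h, w, c))
```

Calling `encrypt_image(img, 16, np.random.default_rng())` once per image draws fresh permutations each time; because the function never returns or stores them, the key is discarded after use, which mirrors the disposable-key idea in the title.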

Abstract

We propose a novel method for securely training the vision transformer (ViT) with sensitive data shared from multiple clients, in a manner similar to privacy-preserving federated learning. In the proposed method, each client independently encrypts its training images with keys that it prepares itself, and, for the first time, ViT is trained using these encrypted images. The method allows clients not only to dispose of the keys but also to reduce the communication costs between a central server and the clients. In image classification experiments on the CIFAR-10 dataset, we verify the effectiveness of the proposed method in terms of classification accuracy and the use of restricted random permutation matrices.

Paper Structure (15 sections, 5 equations, 5 figures, 2 tables)

Introduction
Related Work
Vision transformer
Learnable image encryption
Proposed Method
Overview
Image encryption with random permutation matrices
Use of restricted random permutation matrices
Properties of proposed method
Experiment Result
Setup
Classification accuracy
Visibility of encrypted images
Security analysis
Conclusion

Figures (5)

Figure 1: Overview of ViT
Figure 2: Framework of proposed method
Figure 3: Example of encrypted images
Figure 4: Learning curve of training process
Figure 5: Example of encrypted images using restricted random permutation matrix
