
HETAL: Efficient Privacy-preserving Transfer Learning with Homomorphic Encryption

Seewoo Lee, Garam Lee, Jung Woo Kim, Junbum Shin, Mun-Kyu Lee

TL;DR

HETAL is a practical CKKS-based framework for privacy-preserving transfer learning that supports both encrypted training with early stopping and encrypted inference. Its core contributions are two encrypted matrix multiplication schemes, DiagABT and DiagATB, and a high-precision softmax approximation whose input domain extends to $[-128,128]$, which allows hundreds of training steps while keeping accuracy close to plaintext training. On five benchmark datasets, training finishes in under an hour with accuracy losses under 0.5% relative to plaintext training, which makes the approach viable for machine-learning-as-a-service (MLaaS) settings. The work lowers the computational barrier to encrypted training and enables secure outsourcing of transfer learning tasks with minimal performance penalties.

Abstract

Transfer learning is a de facto standard method for efficiently training machine learning models for data-scarce problems by adding and fine-tuning new classification layers to a model pre-trained on large datasets. Although numerous previous studies proposed using homomorphic encryption to resolve the data privacy issue of transfer learning in the machine-learning-as-a-service (MLaaS) setting, most of them focused only on encrypted inference. In this study, we present HETAL, an efficient Homomorphic Encryption based Transfer Learning algorithm that protects the client's privacy in training tasks by encrypting the client data using the CKKS homomorphic encryption scheme. HETAL is the first practical scheme that strictly provides encrypted training, adopting validation-based early stopping and achieving the accuracy of nonencrypted training. We propose an efficient encrypted matrix multiplication algorithm, which is 1.8 to 323 times faster than prior methods, and a highly precise softmax approximation algorithm with increased coverage. The experimental results for five well-known benchmark datasets show total training times of 567 to 3442 seconds, which is less than an hour.
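The abstract names validation-based early stopping but gives no details of the stopping rule. The following is a minimal plaintext sketch of one common variant, patience-based stopping. The function name `train_with_early_stopping`, the `patience` parameter, and the toy loss values are assumptions made for illustration, not taken from the paper. In HETAL the training step runs on CKKS-encrypted data, which this sketch does not model.

```python
def train_with_early_stopping(train_step, validate, max_epochs=100, patience=3):
    """Run train_step/validate once per epoch and stop when the validation
    loss has not improved for `patience` consecutive epochs.

    Hypothetical helper: the stopping rule is an assumption, since the
    source only says "validation-based early stopping".
    """
    best_loss = float("inf")
    best_epoch = -1
    epochs_without_improvement = 0

    for epoch in range(max_epochs):
        train_step(epoch)           # in HETAL this would be an encrypted update
        val_loss = validate(epoch)  # loss measured on a held-out validation set

        if val_loss < best_loss:
            best_loss = val_loss
            best_epoch = epoch
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                break  # no improvement for `patience` epochs in a row

    return best_epoch, best_loss


# Toy usage with made-up validation losses (not results from the paper).
toy_losses = [1.0, 0.8, 0.7, 0.72, 0.71, 0.75, 0.6]
epochs_run = []
best_epoch, best_loss = train_with_early_stopping(
    train_step=epochs_run.append,
    validate=lambda epoch: toy_losses[epoch],
    max_epochs=len(toy_losses),
    patience=3,
)
```

With these toy values the loop stops before reaching the final epoch, which illustrates the trade-off of a small `patience`: training ends early, but a later improvement can be missed.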

Paper Structure

This paper contains 41 sections, 11 theorems, 58 equations, 6 figures, 12 tables, 5 algorithms.

Key Result

Theorem 1

Let $p: \mathbb{R}^{c} \to \mathbb{R}^{c}$ be an approximation of the softmax on $[-R, R]^{c}$ satisfying Then for $\mathbf{x} \in [-\frac{1}{2}L^n R, \frac{1}{2}L^n R]^c$, we have where $\beta = \beta (\delta, c, r, L, d)$ is a constant that depends only on $\delta, c, r,L, d$.

Figures (6)

  • Figure 1: Our privacy-preserving transfer learning protocol (HETAL)
  • Figure 2: Demonstration of DiagABT algorithm when $A \in \mathbb{R}^{8 \times 8}$ and $B \in \mathbb{R}^{4 \times 8}$. We do not include the complexification optimization in the figure for simplicity.
  • Figure 3: Maximum and minimum value of input of softmax at each step (minibatch) for each dataset.
  • Figure 4: Maximum and minimum value of input of softmax at each step (minibatch) for each dataset, where the model is trained with vanilla SGD.
  • Figure 5: Encoding of a matrix $A \in \mathbb{R}^{13 \times 21}$ into 6 blocks where each encoded matrix of unit shape $8 \times 8$.
  • ...and 1 more figure
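The block count in the Figure 5 caption can be checked with a short plaintext sketch: a $13 \times 21$ matrix tiled with $8 \times 8$ units needs $\lceil 13/8 \rceil \times \lceil 21/8 \rceil = 2 \times 3 = 6$ blocks. The helper name `encode_blocks` and the zero padding of partially filled blocks are assumptions for illustration. The caption does not state how the unused entries are filled, and the sketch does not model packing the blocks into ciphertexts.

```python
import math


def encode_blocks(matrix, unit_rows=8, unit_cols=8):
    """Split a matrix (list of rows) into a grid of unit_rows x unit_cols blocks.

    Hypothetical helper: entries that fall outside the original matrix are
    filled with 0.0, which is an assumption about the padding.
    """
    rows, cols = len(matrix), len(matrix[0])
    n_block_rows = math.ceil(rows / unit_rows)
    n_block_cols = math.ceil(cols / unit_cols)

    grid = []
    for bi in range(n_block_rows):
        grid_row = []
        for bj in range(n_block_cols):
            block = [[0.0] * unit_cols for _ in range(unit_rows)]
            for r in range(unit_rows):
                for c in range(unit_cols):
                    i = bi * unit_rows + r  # row index in the original matrix
                    j = bj * unit_cols + c  # column index in the original matrix
                    if i < rows and j < cols:
                        block[r][c] = matrix[i][j]
            grid_row.append(block)
        grid.append(grid_row)
    return grid


# A 13 x 21 matrix with distinct entries, matching the shape in Figure 5.
A = [[float(i * 21 + j) for j in range(21)] for i in range(13)]
grid = encode_blocks(A)
num_blocks = sum(len(grid_row) for grid_row in grid)
```

Here `grid` is a 2-by-3 arrangement of $8 \times 8$ blocks, so `num_blocks` matches the 6 blocks stated in the caption. Only the blocks in the last block row and last block column contain padding.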

Theorems & Definitions (19)

  • Theorem
  • Proposition 4.1
  • Proposition 4.2
  • Lemma 1.1
  • proof
  • Lemma 1.2
  • proof
  • Lemma 1.3
  • proof
  • Lemma 1.4
  • ...and 9 more