
PACE: Poisoning Attacks on Learned Cardinality Estimation

Jintao Zhang, Chao Zhang, Guoliang Li, Chengliang Chai

TL;DR

This paper proposes a poisoning attack system, PACE, which reduces the accuracy of the learned CE models by 178×, leading to a 10× decrease in the end-to-end performance of the target database.

Abstract

Cardinality estimation (CE) plays a crucial role in database query optimizers. Numerous learned CE models have emerged recently that can outperform traditional methods such as histograms and sampling. However, learned models also bring many security risks. For example, a query-driven learned CE model learns a query-to-cardinality mapping from the historical workload. Such a model can be attacked with poisoning queries, which malicious attackers craft and weave into the historical workload, degrading the accuracy of CE. In this paper, we explore the potential security risks in learned CE and study a new problem: poisoning attacks on learned CE in a black-box setting. Experiments show that our poisoning attack system, PACE, reduces the accuracy of learned CE models by 178 times, leading to a 10 times decrease in the end-to-end performance of the target database.
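
The mechanism described in the abstract can be illustrated with a toy sketch. This is not PACE itself: the table, the one-feature linear model standing in for a learned CE model, the handcrafted poisoning queries, and the use of mean Q-error as the accuracy metric are all assumptions made for illustration, and every name below (`TABLE`, `true_cardinality`, `label`, `fit`, `mean_q_error`) is hypothetical. The attacker only contributes queries; their labels are the true cardinalities the database would report.

```python
# Toy single-column table (an assumption, not from the paper):
# values 0..99 are dense (9 rows each), values 100..999 are sparse
# (one row every 9 values).
TABLE = [i // 9 for i in range(900)] + list(range(100, 1000, 9))

def true_cardinality(lo, hi):
    """Number of rows with lo <= value <= hi, as the database would report."""
    return sum(1 for v in TABLE if lo <= v <= hi)

def label(queries):
    """Execute each (lo, hi) range query to obtain its training label."""
    return [(lo, hi, true_cardinality(lo, hi)) for lo, hi in queries]

def fit(workload):
    """Least-squares fit of cardinality ~ a * width + b.

    A deliberately simple stand-in for a query-driven learned CE model.
    """
    xs = [hi - lo for lo, hi, _ in workload]
    ys = [card for _, _, card in workload]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    a = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum(
        (x - mx) ** 2 for x in xs
    )
    return a, my - a * mx

def mean_q_error(model, workload):
    """Mean of max(est/true, true/est), with both sides clamped to >= 1."""
    a, b = model
    total = 0.0
    for lo, hi, card in workload:
        est = max(a * (hi - lo) + b, 1.0)
        true = max(card, 1)
        total += max(est / true, true / est)
    return total / len(workload)

WIDTHS = (5, 10, 20)
# Historical and test queries both target the dense region.
historical = label([(lo, lo + w) for lo in range(0, 80, 10) for w in WIDTHS])
test = label([(lo, lo + w) for lo in range(3, 75, 10) for w in WIDTHS])
# Handcrafted poisoning queries target the sparse region instead.
poison = label([(lo, lo + w) for lo in range(100, 900, 100) for w in WIDTHS])

clean_model = fit(historical)
poisoned_model = fit(historical + poison)  # retrained on the polluted workload

clean_err = mean_q_error(clean_model, test)
poisoned_err = mean_q_error(poisoned_model, test)
print(f"mean Q-error before poisoning: {clean_err:.2f}")
print(f"mean Q-error after poisoning:  {poisoned_err:.2f}")
```

In this sketch the poisoning queries carry correct labels, yet they shift the training distribution toward a region where the same query width yields far fewer rows, so the retrained model underestimates the test queries. A real attack in the black-box setting would have to find such queries without access to the model's internals, which this toy does not attempt.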

Paper Structure (32 sections, 2 theorems, 12 equations, 15 figures, 10 tables, 1 algorithm)

Key Result

Lemma 1

The problem of poisoning query generation is a bivariate optimization problem over two variables: the query generator $\mathcal{G}$ and the poisoned model $w_p$. In particular, $w_p$ changes with $\mathcal{G}$ as the objective function is maximized.
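
One common way to write a problem of this shape is as a bilevel program. The form below is a sketch under assumptions, not the paper's own equation: the loss $\mathcal{L}$, the test workload $Q_{\text{test}}$, the historical workload $Q_{\text{hist}}$, and the set $Q_{\mathcal{G}}$ of queries produced by the generator are hypothetical symbols introduced here for illustration.

$$
\max_{\mathcal{G}} \; \mathcal{L}\big(w_p;\, Q_{\text{test}}\big)
\quad \text{s.t.} \quad
w_p \in \arg\min_{w} \; \mathcal{L}\big(w;\, Q_{\text{hist}} \cup Q_{\mathcal{G}}\big)
$$

The outer problem chooses $\mathcal{G}$ to maximize the estimation error of the poisoned model, while the inner problem retrains the model on the polluted workload. Every change to $\mathcal{G}$ changes $Q_{\mathcal{G}}$ and therefore $w_p$, which is the coupling the lemma describes.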

Figures (15)

  • Figure 1: An example of a poisoning attack on a learned cardinality estimator.
  • Figure 2: System overview. (§ 3)
  • Figure 3: Training workflow of PACE.
  • Figure 4: Process of generating a poisoning query. (§ 5.2)
  • Figure 5: Analysis of the generator training. (§ 5.3)
  • ...and 10 more figures

Theorems & Definitions (2)

  • Lemma 1: Bivariate optimization
  • Lemma 2: Algorithm Complexity