PACE: Poisoning Attacks on Learned Cardinality Estimation

Jintao Zhang; Chao Zhang; Guoliang Li; Chengliang Chai

TL;DR

This paper proposes PACE, a poisoning attack system that reduces the accuracy of learned CE models by 178×, which in turn degrades the end-to-end performance of the target database by 10×.

Abstract

Cardinality estimation (CE) plays a crucial role in database query optimizers. Numerous learned CE models have emerged recently, and they can outperform traditional methods such as histograms and sampling. However, learned models also bring security risks. For example, a query-driven learned CE model learns a query-to-cardinality mapping from the historical workload. Such a model can be attacked with poisoning queries, which malicious attackers craft and weave into the historical workload to degrade CE performance. In this paper, we explore the potential security risks in learned CE and study a new problem: poisoning attacks on learned CE in a black-box setting. Experiments show that our attack system, PACE, reduces the accuracy of the learned CE models by 178 times, leading to a 10 times decrease in the end-to-end performance of the target database.
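The poisoning idea described in the abstract can be illustrated with a toy sketch. This is not the PACE system: the column, the log-linear stand-in for a query-driven CE model, the hand-picked poisoning queries, and the mean Q-error metric (max of est/true and true/est) are all assumptions chosen for illustration. The sketch also assumes that poisoning queries are executed like any other query, so they enter the workload with their true cardinalities.

```python
import math

# Toy column (assumption): value v appears 2*v + 1 times,
# so COUNT(col <= x) equals (x + 1)**2 for integer x in [0, 99].
DATA = [v for v in range(100) for _ in range(2 * v + 1)]

def true_cardinality(x):
    """Execute the range predicate `col <= x` against the toy column."""
    return sum(1 for v in DATA if v <= x)

def fit_log_linear(workload):
    """Least-squares fit of log(cardinality) = a * x + b on (x, card) pairs."""
    n = len(workload)
    xs = [x for x, _ in workload]
    zs = [math.log(c) for _, c in workload]
    mean_x = sum(xs) / n
    mean_z = sum(zs) / n
    var_x = sum((x - mean_x) ** 2 for x in xs)
    cov_xz = sum((x - mean_x) * (z - mean_z) for x, z in zip(xs, zs))
    a = cov_xz / var_x
    b = mean_z - a * mean_x
    return a, b

def estimate(model, x):
    """Estimated cardinality of `col <= x` under the fitted model."""
    a, b = model
    return math.exp(a * x + b)

def mean_q_error(model, queries):
    """Average of max(est/true, true/est) over the given queries."""
    total = 0.0
    for x in queries:
        est, true = estimate(model, x), true_cardinality(x)
        total += max(est / true, true / est)
    return total / len(queries)

# Historical workload: queries concentrated on small predicate values.
historical = [(x, true_cardinality(x)) for x in range(10, 50)]
# Hand-picked poisoning queries (assumption): far from the historical range.
poison = [(x, true_cardinality(x)) for x in range(90, 100)] * 4

clean_model = fit_log_linear(historical)
poisoned_model = fit_log_linear(historical + poison)

# Evaluate both models on queries drawn from the historical range.
test_queries = list(range(10, 50))
clean_err = mean_q_error(clean_model, test_queries)
poisoned_err = mean_q_error(poisoned_model, test_queries)
print(f"mean Q-error: clean={clean_err:.2f}, poisoned={poisoned_err:.2f}")
```

In this toy setting, retraining on the mixed workload pulls the fitted line toward the poisoning queries, so estimates for the original query range get worse. The effect is far milder than the numbers reported above, since the sketch neither optimizes the poisoning queries nor targets a real learned model.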

Paper Structure (32 sections, 2 theorems, 12 equations, 15 figures, 10 tables, 1 algorithm)

Introduction
Preliminaries
Query-driven Cardinality Estimation
Threat Model
Problem Definition
Related Work
PACE Framework
Overview
Surrogate CE Model Acquisition
Generator and Detector Training
Attacking
Surrogate CE Model Acquisition
Model Type Speculating of the Black Box
Training Strategy
Poisoning Query Generation
...and 17 more sections

Key Result

Lemma 1

The problem of poisoning query generation is a bivariate optimization problem over two variables: the query generator $\mathcal{G}$ and the poisoned model $w_p$. In particular, $w_p$ changes with $\mathcal{G}$ as the objective function is maximized.
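
One way to read Lemma 1 is as a nested optimization, sketched here under assumed notation. Only $\mathcal{G}$ and $w_p$ come from the lemma; the historical workload $W$, the poisoning queries $Q_{\mathcal{G}}$ produced by the generator, the test queries $Q_{\text{test}}$, and the estimation loss $\mathcal{L}$ are symbols introduced for illustration and may differ from the paper's own formulation.

$$
\max_{\mathcal{G}} \; \mathcal{L}\bigl(w_p;\, Q_{\text{test}}\bigr)
\quad \text{s.t.} \quad
w_p = \arg\min_{w} \; \mathcal{L}\bigl(w;\, W \cup Q_{\mathcal{G}}\bigr)
$$

Under this reading, the outer problem chooses the generator, while the inner problem retrains the model on the poisoned workload, which is why $w_p$ cannot be held fixed while $\mathcal{G}$ is updated.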

Figures (15)

Figure 1: An example of a poisoning attack on a learned cardinality estimator.
Figure 2: System overview. (§ 3)
Figure 3: Training workflow of PACE.
Figure 4: Process of generating a poisoning query. (§ 5.2)
Figure 5: Analysis of the generator training. (§ 5.3)
...and 10 more figures

Theorems & Definitions (2)

Lemma 1: Bivariate optimization
Lemma 2: Algorithm Complexity
