A One-Layer Decoder-Only Transformer is a Two-Layer RNN: With an Application to Certified Robustness

Yuhao Zhang, Aws Albarghouthi, Loris D'Antoni

TL;DR

ARC-Tran is a novel approach for verifying the robustness of decoder-only Transformers against arbitrary perturbation spaces. It also trains models that are more robust to such perturbation spaces than those produced by existing techniques.

Abstract

This paper shows that a one-layer decoder-only Transformer is equivalent to a two-layer Recurrent Neural Network (RNN). Building on this insight, we propose ARC-Tran, a novel approach for verifying the robustness of decoder-only Transformers against arbitrary perturbation spaces. Current robustness verification techniques, by contrast, are limited either to specific, length-preserving perturbations like word substitutions or to recursive models like LSTMs. ARC-Tran addresses these limitations by carefully managing position encoding to prevent mismatches and by using the Transformer-RNN equivalence to achieve precise and scalable verification. Our evaluation shows that ARC-Tran (1) trains models more robust to arbitrary perturbation spaces than those produced by existing techniques and (2) achieves high certification accuracy on the resulting models.
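To give some intuition for why an attention layer can be read as a recurrence, the following is a minimal sketch and not the paper's construction. It assumes a single attention head, no position encoding, and no feed-forward block, and it only computes the output at the final position (the one a classifier on top of a decoder-only model would typically read). All function names and the two-pass split are illustrative assumptions: the first pass carries the latest token's query as its state, and the second pass folds over the tokens while carrying a running softmax numerator and denominator.

```python
import math
import random


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def matvec(W, x):
    return [dot(row, x) for row in W]


def random_matrix(rows, cols, rng):
    return [[rng.uniform(-1, 1) for _ in range(cols)] for _ in range(rows)]


def attention_last_position(xs, Wq, Wk, Wv):
    """Reference (parallel) form: softmax attention output at the last token."""
    q = matvec(Wq, xs[-1])
    scores = [dot(q, matvec(Wk, x)) for x in xs]
    m = max(scores)
    weights = [math.exp(s - m) for s in scores]
    z = sum(weights)
    values = [matvec(Wv, x) for x in xs]
    return [sum(w * v[j] for w, v in zip(weights, values)) / z
            for j in range(len(Wv))]


def attention_last_position_recurrent(xs, Wq, Wk, Wv):
    """Same quantity, computed with two left-to-right recurrences."""
    # Pass 1 (hypothetical "first layer"): the state is the query of the
    # most recent token, so after the loop it holds the last token's query.
    q = None
    for x in xs:
        q = matvec(Wq, x)

    # Pass 2 (hypothetical "second layer"): the state is a running maximum,
    # a running weighted sum of values, and a running normalizer.
    m = -math.inf
    num = [0.0] * len(Wv)
    den = 0.0
    for x in xs:
        s = dot(q, matvec(Wk, x))
        new_m = max(m, s)
        scale = math.exp(m - new_m)  # rescales the old state; 0.0 on step one
        w = math.exp(s - new_m)
        v = matvec(Wv, x)
        num = [scale * n + w * vj for n, vj in zip(num, v)]
        den = scale * den + w
        m = new_m
    return [n / den for n in num]


# Small comparison on random data (fixed seed for reproducibility).
rng = random.Random(0)
d = 4
Wq, Wk, Wv = (random_matrix(d, d, rng) for _ in range(3))
tokens = [[rng.uniform(-1, 1) for _ in range(d)] for _ in range(6)]
parallel = attention_last_position(tokens, Wq, Wk, Wv)
recurrent = attention_last_position_recurrent(tokens, Wq, Wk, Wv)
```

The two results agree up to floating-point error. Because the recurrent form processes one token at a time with a fixed-size state, it is the kind of formulation that verification techniques designed for recurrent models can operate on, including for perturbations that change the input length.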

Paper Structure (16 sections, 1 theorem, 17 equations, 1 figure, 1 table)

This paper contains 16 sections, 1 theorem, 17 equations, 1 figure, 1 table.

Key Result

Theorem 2.1

$f_n$ as defined in the RNN formulation (equation labeled `rnn_trans`) is equivalent to the Transformer definition (equation labeled `def_trans_raw`).

Figures (1)

  • Figure 1: A one-layer decoder-only Transformer is equivalent to a two-layer RNN.

Theorems & Definitions (6)

  • Theorem 2.1
  • Example 3.1
  • Example 3.2
  • Example 4.1
  • Example 4.2
  • Proof