Bag of Lies: Robustness in Continuous Pre-training BERT

Ine Gevers, Walter Daelemans

TL;DR

This study investigates how continuous pre-training (CPT) updates BERT's entity knowledge about COVID-19, a topic that postdates the model's original pre-training data, and evaluates the result on the Check-COVID fact-verification benchmark. It systematically tests CPT on several data sources (LitCovid articles, unlabeled Check-COVID data, Reddit) and on adversarial inputs (AI-generated misinformation and word-order shuffling), for both BERT-base and BERT-large. CPT can improve downstream performance even when the CPT data contain misinformation, which suggests that CPT is surprisingly robust to adversarial inputs. The work also releases a dataset that pairs LitCovid texts with AI-generated misinformation and paraphrase counterparts, along with code, and offers guidance on data selection, robustness, and scalability when using CPT to update knowledge in language models.

Abstract

This study aims to gain insight into the continuous pre-training phase of BERT with respect to entity knowledge, using the COVID-19 pandemic as a case study. Since the pandemic emerged after the last update of BERT's pre-training data, the model has little to no entity knowledge about COVID-19. Using continuous pre-training, we control what entity knowledge is available to the model. We compare the baseline BERT model with the further pre-trained variants on the fact-checking benchmark Check-COVID. To test the robustness of continuous pre-training, we experiment with several adversarial methods to manipulate the input data, such as training on misinformation and shuffling the word order until the input becomes nonsensical. Surprisingly, our findings reveal that these methods do not degrade, and sometimes even improve, the model's downstream performance. This suggests that continuous pre-training of BERT is robust against misinformation. Furthermore, we release a new dataset consisting of original texts from academic publications in the LitCovid repository and their AI-generated false counterparts.
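
To make the word-order manipulation concrete, here is a minimal sketch of shuffling a passage into a bag of words. This section does not describe the authors' exact procedure, so the whitespace tokenization, the full uniform shuffle, and the function name `shuffle_word_order` are all assumptions made for illustration.

```python
import random


def shuffle_word_order(text: str, seed: int = 0) -> str:
    """Return `text` with its whitespace-separated tokens in random order.

    Hypothetical helper: the tokenization and shuffling scheme are
    assumptions, not the procedure used in the paper.
    """
    words = text.split()
    rng = random.Random(seed)  # local RNG so the result is reproducible
    rng.shuffle(words)         # in-place uniform shuffle of the token list
    return " ".join(words)


sentence = "COVID-19 vaccines reduce the risk of severe illness"
shuffled = shuffle_word_order(sentence, seed=13)
print(shuffled)
```

The shuffled text keeps exactly the same vocabulary as the original, so a model trained on it still sees which words co-occur in a passage but no longer sees their order.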

Paper Structure

This paper contains 17 sections, 2 figures, 2 tables.

Figures (2)

  • Figure 1: Illustration of the model architecture.
  • Figure 2: Illustration of adversarial transformations of the input text.