How to Query Language Models?
Leonard Adolphs, Shehzaad Dhuliawala, Thomas Hofmann
TL;DR
The paper investigates how to extract factual and commonsense knowledge from masked language models without fine-tuning. It introduces querying by example: demonstrations of the same relation are added to a cloze prompt to disambiguate the information need. The approach is evaluated on LAMA, TextWorld Commonsense, and BATS. With about 10 demonstrations, results improve substantially on T-REx and ConceptNet, whereas Google-RE can be adversely affected, so the benefit depends on the dataset. The improvements are attributed to disambiguation in the embedding space, and they are obtained in a single forward pass because the demonstrations are supplied through the model's context. Overall, the work presents a simple, efficient prompting strategy that uncovers more latent knowledge in LMs and informs practical prompt design for knowledge retrieval tasks.
Abstract
Large pre-trained language models (LMs) can recover not only linguistic but also factual and commonsense knowledge. To access the knowledge stored in mask-based LMs, we can use cloze-style questions and let the model fill in the blank. This flexibility, an advantage over structured knowledge bases, comes with a drawback: one has to find the right query for a given information need. Inspired by how humans disambiguate a question, we propose to query LMs by example. To clarify the ambiguous question "Who does Neuer play for?", a successful strategy is to demonstrate the relation using another subject, e.g., "Ronaldo plays for Portugal. Who does Neuer play for?". We apply this approach of querying by example to the LAMA probe and obtain substantial improvements of up to 37.8% for BERT-large on the T-REx data when providing only 10 demonstrations, even outperforming a baseline that queries the model with up to 40 paraphrases of the question. The examples are provided through the model's context and thus require neither fine-tuning nor an additional forward pass. This suggests that LMs contain more factual and commonsense knowledge than previously assumed, provided we query the model in the right way.
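
As a minimal sketch of how a query-by-example prompt could be assembled, the following Python snippet places filled-in demonstrations of a relation in front of a cloze query for a new subject. The `[X]`/`[Y]` placeholder convention, the function name `build_query_by_example`, and the default `[MASK]` token are illustrative assumptions, not details taken from the paper.

```python
from typing import List, Tuple


def build_query_by_example(
    template: str,
    demonstrations: List[Tuple[str, str]],
    subject: str,
    mask_token: str = "[MASK]",  # assumption: BERT-style mask token
) -> str:
    """Prepend filled-in demonstrations of a relation to a cloze query.

    `template` uses the hypothetical placeholders [X] (subject) and
    [Y] (object), e.g. "[X] plays for [Y]."
    """

    def fill(subj: str, obj: str) -> str:
        return template.replace("[X]", subj).replace("[Y]", obj)

    # Each demonstration is a complete statement of the same relation.
    sentences = [fill(subj, obj) for subj, obj in demonstrations]
    # The actual query leaves the object slot masked for the model to fill.
    sentences.append(fill(subject, mask_token))
    return " ".join(sentences)


prompt = build_query_by_example(
    template="[X] plays for [Y].",
    demonstrations=[("Ronaldo", "Portugal")],
    subject="Neuer",
)
print(prompt)  # Ronaldo plays for Portugal. Neuer plays for [MASK].
```

Because the demonstrations and the query form one input string, a masked LM would score the masked position in a single forward pass, with no fine-tuning involved.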
