Contextual Clarity: Generating Sentences with Transformer Models using Context-Reverso Data
Ruslan Musaev
TL;DR
The paper tackles the task of generating informative, unambiguous sentence-contexts for given keywords (Keyword in Context, KIC) by building a dataset from the Context-Reverso API and fine-tuning transformer-based models on it. Using T5-small and especially T5-base, the paper shows that context-rich sentence generation improves over a GPT-2 baseline, as measured by BLEU and METEOR at two dataset scales (10K and 1M samples). The work demonstrates practical applicability by deploying a Telegram bot for word learning that uses the generated contexts. The approach highlights the value of external data sources for training context-aware generation systems, and the code is provided for reproducibility.
Abstract
In the age of information abundance, the ability to provide users with contextually relevant and concise information is crucial. Keyword in Context (KIC) generation is a task that plays a vital role in generation applications, such as search engines, personal assistants, and content summarization. In this paper, we present a novel approach to generating unambiguous and brief sentence-contexts for given keywords using the T5 transformer model, leveraging data obtained from the Context-Reverso API. The code is available at https://github.com/Rusamus/word2context/tree/main.
