Towards Knowledge-Grounded Natural Language Understanding and Generation
Chenxi Whitehouse
TL;DR
The thesis investigates knowledge-grounded natural language understanding and generation with transformer models, examining structured, multilingual, and unstructured knowledge sources. It presents five papers spanning fake news detection with knowledge-enhanced PLMs, entity-centric code-switching for cross-lingual transfer, faithful information extraction on the web, grounded answer and explanation generation in knowledge-intensive VQA, and LLM-powered data augmentation for multilingual commonsense tasks. Key results show that up-to-date entity knowledge improves fake news detection; multilingual entity knowledge via EntityCS enhances zero-shot transfer across NER, fact retrieval, slot filling, and WSD; WebIE provides a robust framework for faithful web information extraction; unified VQA models (UMAE) achieve state-of-the-art explanations and answers with grounded generation; and LLM-generated synthetic data substantially boosts the performance of smaller multilingual models. Collectively, the work demonstrates the practical benefits of diverse knowledge representations and grounding strategies, and outlines future directions in dynamic retrieval-enabled grounding and mixture-of-experts approaches for keeping NLP systems up to date and faithful.
Abstract
This thesis investigates how natural language understanding and generation with transformer models can benefit from grounding the models in knowledge representations. It addresses the following key research questions: (i) Can knowledge of entities extend its benefits beyond entity-centric tasks, such as entity linking? (ii) How can we faithfully and effectively extract such structured knowledge from raw text, especially noisy web text? (iii) How do other types of knowledge, beyond structured knowledge, contribute to improving NLP tasks? Studies in this thesis find that incorporating relevant and up-to-date knowledge of entities benefits fake news detection, and that entity-focused code-switching significantly enhances zero-shot cross-lingual transfer on entity-centric tasks. For extracting structured knowledge effectively and faithfully, integrating negative examples and training with entity planning significantly improves performance. Other general forms of knowledge, such as parametric and distilled knowledge, are also found to enhance multimodal and multilingual knowledge-intensive tasks. This research shows the tangible benefits of diverse knowledge integration and motivates further exploration in this direction.
