Talking to Machines: do you read me?
Lina M. Rojas-Barahona
TL;DR
This work surveys the evolution of dialogue systems from modular, task-specific architectures to end-to-end neural approaches and large language models, emphasizing Task-Oriented Dialogues (TOD) and Conversational Question Answering (CQA). It presents a coherent program of contributions across NLU, Dialogue Management, and NLG for TOD, including data augmentation, few-shot learning, Bayesian IRL, imitation learning, and graph-based policies. It also extends to conversational QA with ellipsis/coreference detection, question rewriting, and a Wikidata-grounded corpus, alongside advanced KG embeddings in hyperbolic space. The dissertation further documents data collection, annotation practices, and dialogue frameworks (notably PyDial and Dialport), culminating in a scientific project on LLMs for TOD and multimodal dialogue with regard to evaluation, grounding, and decoding control. Overall, the work advances both methodological foundations and practical resources for robust, scalable dialogue systems and foundational QA over knowledge graphs.
Abstract
In this dissertation, I guide the reader through research on dialogue and, more precisely, through the research I have conducted during my career since my PhD thesis, from modular architectures with machine learning/deep learning and reinforcement learning to end-to-end deep neural networks. Besides my work as a research associate, I also present the work I have supervised in recent years. I briefly review the state of the art and highlight the open research problems on conversational agents. Afterwards, I present my contribution to Task-Oriented Dialogues (TOD), both as a research associate and as the industrial supervisor of CIFRE theses. I then discuss conversational QA. In particular, I present the work of two PhD candidates, Thibault Cordier and Sebastien Montella, as well as the work of the young researcher Quentin Brabant. Finally, I present the scientific project, in which I discuss Large Language Models (LLMs) for Task-Oriented Dialogue and Multimodal Task-Oriented Dialogue.
