MACT: Model-Agnostic Cross-Lingual Training for Discourse Representation Structure Parsing
Jiangming Liu
TL;DR
This work tackles cross-lingual discourse representation structure (DRS) parsing, addressing the limitation of monolingual training by introducing a model-agnostic cross-lingual training approach that leverages multilingual data via pre-trained language models. It frames DRS parsing as conditional generation and employs parameter-efficient fine-tuning with LoRA on a large multilingual model (mT0-large), achieving state-of-the-art results on PMB 3.0.0 and 4.0.0 across English, German, Italian, and Dutch for both clause and graph forms, while offering detailed analyses of well-formedness and error patterns. The method enables effective cross-language generalization without language-identification in training data and demonstrates the practicality of combining cross-lingual training with language-specific fine-tuning to maximize performance, scalability, and data efficiency. This approach holds promise for multilingual semantic parsing tasks beyond DRS, particularly where data are unevenly available across languages and model efficiency is essential.
Abstract
Discourse Representation Structure (DRS) is an innovative semantic representation designed to capture the meaning of texts with arbitrary lengths across languages. The semantic representation parsing is essential for achieving natural language understanding through logical forms. Nevertheless, the performance of DRS parsing models remains constrained when trained exclusively on monolingual data. To tackle this issue, we introduce a cross-lingual training strategy. The proposed method is model-agnostic yet highly effective. It leverages cross-lingual training data and fully exploits the alignments between languages encoded in pre-trained language models. The experiments conducted on the standard benchmarks demonstrate that models trained using the cross-lingual training method exhibit significant improvements in DRS clause and graph parsing in English, German, Italian and Dutch. Comparing our final models to previous works, we achieve state-of-the-art results in the standard benchmarks. Furthermore, the detailed analysis provides deep insights into the performance of the parsers, offering inspiration for future research in DRS parsing. We keep updating new results on benchmarks to the appendix.
