SETUP: Sentence-level English-To-Uniform Meaning Representation Parser

Emma Markle, Javier Gutierrez Bach, Shira Wein

TL;DR

This work investigates sentence-level English text-to-UMR parsing, where Uniform Meaning Representation (UMR) is a multilingual semantic representation framework that extends Abstract Meaning Representation (AMR) to cross-linguistic contexts. It compares a baseline AMR-to-UMR pipeline with two approaches: fine-tuning AMR parsers directly on UMR data, and bootstrapping partial UMRs from Universal Dependencies (UD) trees and then completing them with a T5 model. The best-performing approach (SETUP) uses fine-tuned AMR parsers and achieves high graph-similarity scores (AnCast ≈84, SMATCH++ ≈91), while the UD-based methods remain competitive with distinct trade-offs. The results suggest that existing AMR tools can be leveraged to bootstrap robust English UMR parsers, and they lay groundwork for extending UMR parsing to low-resource languages and document-level representations in future work.

Abstract

Uniform Meaning Representation (UMR) is a novel graph-based semantic representation which captures the core meaning of a text, with flexibility incorporated into the annotation schema such that the breadth of the world's languages can be annotated (including low-resource languages). While UMR shows promise in enabling language documentation, improving low-resource language technologies, and adding interpretability, the downstream applications of UMR can only be fully explored when text-to-UMR parsers enable the automatic large-scale production of accurate UMR graphs at test time. Prior work on text-to-UMR parsing is limited to date. In this paper, we introduce two methods for English text-to-UMR parsing, using prior work as a baseline: one fine-tunes existing parsers for Abstract Meaning Representation (AMR), and the other leverages a converter from Universal Dependencies (UD). Our best-performing model, which we call SETUP, achieves an AnCast score of 84 and a SMATCH++ score of 91, indicating substantial gains towards automatic UMR parsing.

Paper Structure

This paper contains 13 sections, 1 figure, and 5 tables.

Figures (1)

  • Figure 1: UMR graph for the sentence "They walked on the street", shown as a graph and in PENMAN notation (Kasper, 1989).
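
For readers unfamiliar with PENMAN notation, the sketch below shows how a bracketed PENMAN string maps onto graph triples. It is a minimal illustration, not the paper's code: the sample graph is a simplified, hypothetical rendering of "They walked on the street" (its role labels and variable names are assumptions and may differ from the paper's Figure 1), and the helper name penman_triples is invented for this example. It does not handle quoted strings that contain parentheses, or other edge cases that a full PENMAN library would cover.

```python
import re

# Tokens: parentheses, the instance slash, roles (":ARG0"), quoted strings, atoms.
TOKEN = re.compile(r'\(|\)|/|:[^\s()]+|"[^"]*"|[^\s()/:]+')

# Hypothetical, simplified graph; labels are illustrative assumptions.
SAMPLE = """
(w / walk-01
    :ARG0 (t / they)
    :location (s / street)
    :aspect performance)
"""


def penman_triples(text):
    """Return (source, role, target) triples for one PENMAN-encoded graph."""
    tokens = TOKEN.findall(text)
    pos = 0
    triples = []

    def parse_node():
        nonlocal pos
        # A node has the shape: ( variable / concept (role target)* )
        if tokens[pos] != "(" or tokens[pos + 2] != "/":
            raise ValueError("expected '(variable / concept'")
        var, concept = tokens[pos + 1], tokens[pos + 3]
        pos += 4
        triples.append((var, ":instance", concept))
        while tokens[pos] != ")":
            role = tokens[pos]
            pos += 1
            if tokens[pos] == "(":
                target = parse_node()  # nested node: the edge points to its variable
            else:
                target = tokens[pos]   # constant value or re-used variable
                pos += 1
            triples.append((var, role, target))
        pos += 1  # consume the closing ")"
        return var

    parse_node()
    return triples


for triple in penman_triples(SAMPLE):
    print(triple)
```

Each nested parenthesis introduces a node with a variable and a concept, and each role becomes a labeled edge from the enclosing node. Graph-similarity metrics of the kind reported here compare predicted and gold graphs over structures like these triples.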