Generating Text from Uniform Meaning Representation

Emma Markle, Reihaneh Iranmanesh, Shira Wein

TL;DR

The paper addresses generating fluent text from Uniform Meaning Representation (UMR), a multilingual graph-based semantic representation. It proposes three strategies that leverage Abstract Meaning Representation (AMR) technologies: passing UMR graphs unchanged to AMR-to-text generation models as a baseline, converting UMR to AMR in a pipeline before generation, and fine-tuning both AMR-to-text models and foundation models directly on UMR data. Empirical results show that fine-tuned AMR-to-text models yield the strongest performance, with the best models reaching multilingual BERTScores of 0.825 for English and 0.882 for Chinese, and that document-level information provides gains even for single-sentence outputs. The work highlights data limitations, especially for Indigenous languages, and demonstrates the value of AMR-informed fine-tuning for building an initial UMR-to-text ecosystem across languages.

Abstract

Uniform Meaning Representation (UMR) is a recently developed graph-based semantic representation that expands on Abstract Meaning Representation (AMR) in a number of ways, in particular through the inclusion of document-level information and multilingual flexibility. To effectively adopt and leverage UMR for downstream tasks, effort must be directed toward developing a UMR technological ecosystem. Though only a small number of UMR annotations have been produced to date, in this work we investigate the first approaches to producing text from multilingual UMR graphs. Exploiting the structural similarity between UMR and AMR graphs and the wide availability of AMR technologies, we introduce (1) a baseline approach that passes UMR graphs to AMR-to-text generation models, (2) a pipeline approach that converts UMR to AMR and then applies AMR-to-text generation models, and (3) a fine-tuning approach for both foundation models and AMR-to-text generation models with UMR data. Our best-performing models achieve multilingual BERTScores of 0.825 for English and 0.882 for Chinese, a promising indication of the effectiveness of fine-tuning approaches for UMR-to-text generation even with limited UMR data.
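
To make the idea behind approach (2) concrete, the following is a minimal sketch of one step that a UMR-to-AMR conversion might plausibly include: dropping sentence-level attributes that UMR annotates but AMR does not. It is not the authors' pipeline. The function name `strip_umr_attributes`, the attribute list, and the example graph are assumptions chosen for illustration, and a real conversion would need to handle much more (for example, mapping abstract `person` nodes back to pronouns).

```python
import re

# Assumption: a small, non-exhaustive set of UMR attribute roles that have
# no counterpart in AMR. The paper's actual conversion rules may differ.
UMR_ONLY_ROLES = ("aspect", "modal-strength", "refer-person", "refer-number")

# Matches a role from the list above together with its atomic value,
# including the whitespace that precedes the role.
_UMR_ATTR = re.compile(
    r"\s*:(?:" + "|".join(re.escape(r) for r in UMR_ONLY_ROLES) + r")\s+[^\s()]+"
)


def strip_umr_attributes(penman_graph: str) -> str:
    """Remove UMR-only attribute/value pairs from a PENMAN string."""
    return _UMR_ATTR.sub("", penman_graph)


# Hypothetical UMR-style graph written for this sketch; it is not copied
# from the paper's figures.
UMR_EXAMPLE = """(s1s / search-01
    :ARG0 (s1p / person
        :refer-person 3rd
        :refer-number singular)
    :ARG1 (s1c / clue
        :refer-number singular)
    :aspect activity
    :modal-strength full-affirmative)"""

if __name__ == "__main__":
    print(strip_umr_attributes(UMR_EXAMPLE))
```

The regular expression only removes attributes whose value is a single atom, so nested subgraphs and the parenthesis structure are left untouched. A string-level rewrite like this is fragile compared with parsing the graph properly, and is shown only because it keeps the sketch dependency-free.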

Paper Structure

This paper contains 20 sections, 4 figures, 12 tables.

Figures (4)

  • Figure 1: UMR graph for the sentence "He was searching for a clue" in graph form (top) and in PENMAN notation (Kasper, 1989) (bottom).
  • Figure 2: AMR and UMR graphs for the sentence "Pleasure," compared with the AMR graph produced by our UMR-to-AMR pipeline.
  • Figure 3: English fluency instructions
  • Figure 4: English adequacy instructions