A Compositional Typed Semantics for Universal Dependencies
Laurestine Bradford, Timothy John O'Donnell, Siva Reddy
TL;DR
The paper addresses the challenge of deriving meanings from sentences whose structures vary across languages. It introduces UD Type Calculus (UD-TC), a compositional, language-agnostic semantic framework built on Universal Dependencies (UD). UD-TC assigns denotations to both words and UD relations as functions in a typed lambda calculus whose values are discourse representation structures (DRSs), and it uses a hand-crafted lexical inventory to derive multiple candidate meanings for each UD tree. Evaluated on the Parallel Meaning Bank in English, German, Italian, and Dutch, UD-TC produces meaning representations competitive with a UD-Boxer baseline, with notable strength on discourse relations and on quantifier scope, where it can return several possible forms. The approach reuses existing UD data for cross-linguistic semantic interpretation and points to future work on probabilistic lexicon induction and coverage of more languages. Overall, UD-TC offers a principled route from UD syntax to cross-linguistic semantic representations.
Abstract
Languages may encode similar meanings using different sentence structures. This makes it a challenge to provide a single set of formal rules that can derive meanings from sentences in many languages at once. To overcome the challenge, we can take advantage of language-general connections between meaning and syntax, and build on cross-linguistically parallel syntactic structures. We introduce UD Type Calculus (UD-TC), a compositional, principled, and language-independent system of semantic types and logical forms for lexical items which builds on a widely used language-general dependency syntax framework. We explain the essential features of UD Type Calculus, which all involve giving dependency relations denotations just like those of words. These allow UD-TC to derive correct meanings for sentences with a wide range of syntactic structures by making use of dependency labels. Finally, we present evaluation results on a large existing corpus of sentences and their logical forms, showing that UD-TC can produce meanings comparable with our baseline.
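
The idea of giving dependency relations denotations just like those of words can be illustrated with a toy sketch. This is not the UD-TC lexicon or type system: the `Node` class, the `verb` and `role` helpers, the role labels `Agent` and `Theme`, and the flat set of conditions standing in for a DRS are all assumptions made for illustration. The only point carried over from the text is that a relation label such as `nsubj` or `obj` is itself a function that combines the meaning of a dependent with the meaning of its head.

```python
from dataclasses import dataclass, field
from typing import Callable, FrozenSet, List, Tuple

# A condition is a tuple such as ("see", "e1") or ("Agent", "e1", "mary").
# A flat set of conditions is a simplified stand-in for a DRS (assumption).
Condition = Tuple[str, ...]
EventPred = Callable[[str], FrozenSet[Condition]]


@dataclass
class Node:
    """A dependency tree node: a word plus (relation label, child) pairs."""
    word: str
    children: List[Tuple[str, "Node"]] = field(default_factory=list)


def verb(name: str) -> EventPred:
    """Hypothetical verb denotation: a predicate over an event variable."""
    return lambda e: frozenset({(name, e)})


def role(label: str):
    """Hypothetical relation denotation: take the dependent's meaning, then
    the head's meaning, and return an enriched head meaning."""
    def relation(dep: str) -> Callable[[EventPred], EventPred]:
        return lambda head: lambda e: head(e) | {(label, e, dep)}
    return relation


# Toy lexicon: words and relations both map to denotations.
WORDS = {"sees": verb("see"), "Mary": "mary", "John": "john"}
RELATIONS = {"nsubj": role("Agent"), "obj": role("Theme")}


def interpret(node: Node):
    """Compose a node's meaning by applying each relation's denotation
    to the child's meaning and then to the head's meaning so far."""
    meaning = WORDS[node.word]
    for rel, child in node.children:
        meaning = RELATIONS[rel](interpret(child))(meaning)
    return meaning


# "Mary sees John" as a dependency tree rooted at the verb.
tree = Node("sees", [("nsubj", Node("Mary")), ("obj", Node("John"))])
conditions = interpret(tree)("e1")  # instantiate the event variable as e1
```

Because the composition is driven by the relation labels rather than by word order, a tree whose children are listed in a different order yields the same set of conditions in this sketch. That loosely mirrors the motivation for building on cross-linguistically parallel dependency structures.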
