
EventNet-ITA: Italian Frame Parsing for Events

Marco Rovera

TL;DR

EventNet-ITA (EvN-ITA) is a large-scale Italian corpus annotated with over 200 event frames and 3,571 frame elements across more than 53k sentences, accompanied by a transformer-based multi-label sequence labeling approach for frame parsing. The study reports strong frame and frame-element identification (F1 of $0.90$ and $0.724$, respectively) and emphasizes an efficient, end-to-end learning paradigm that minimizes error propagation. It also provides thorough annotation guidelines, inter-annotator agreement metrics, and release plans to support reproducibility and downstream tasks in historical and social-domain NLP. The work advances Italian frame semantics by delivering a publicly available resource and a readily usable model for event extraction across diverse domains.

Abstract

This paper introduces EventNet-ITA, a large, multi-domain corpus annotated full-text with event frames for Italian. Moreover, we present and thoroughly evaluate an efficient multi-label sequence labeling approach for Frame Parsing. Covering a wide range of individual, social and historical phenomena, with more than 53,000 annotated sentences and over 200 modeled frames, EventNet-ITA constitutes the first systematic attempt to provide the Italian language with a publicly available resource for Frame Parsing of events, useful for a broad spectrum of research and application tasks. Our approach achieves a promising strict F1-score of 0.9 for frame classification and 0.72 for frame element classification, while keeping computational requirements low. The annotated corpus and the frame parsing model are released under an open license.
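
The multi-label sequence labeling formulation mentioned above can be illustrated with a minimal sketch. It is not the paper's implementation: the label names, the BIO-style prefixes, and the helper functions `encode_multi_hot` and `decode_multi_label` are hypothetical and chosen only for illustration. The point it shows is that, under full-text annotation, a single token may carry several labels at once (for example, it can trigger one frame while also belonging to a frame element of another), so each token is assigned a multi-hot vector rather than a single class.

```python
from typing import Dict, List, Set

def encode_multi_hot(token_labels: List[Set[str]],
                     label_index: Dict[str, int]) -> List[List[int]]:
    """Turn per-token label sets into multi-hot target vectors."""
    rows = []
    for labels in token_labels:
        row = [0] * len(label_index)
        for label in labels:
            row[label_index[label]] = 1
        rows.append(row)
    return rows

def decode_multi_label(scores: List[List[float]],
                       labels: List[str],
                       threshold: float = 0.5) -> List[Set[str]]:
    """Keep every label whose per-token score reaches the threshold."""
    return [{labels[j] for j, s in enumerate(row) if s >= threshold}
            for row in scores]

# Hypothetical annotation of a fragment of the Figure 2 example sentence.
# The frame and frame-element names below are assumptions, not taken
# from the EvN-ITA schema.
tokens = ["construction", "of", "the", "fortification", "Norman", "invasion"]
token_labels = [
    {"B-Building"},                               # frame trigger
    set(),
    {"B-Building:Created_entity"},
    {"I-Building:Created_entity"},
    {"B-Invading:Invader", "B-Building:Time"},    # two labels on one token
    {"B-Invading", "I-Building:Time"},            # trigger + element of another frame
]

label_list = sorted(set().union(*token_labels))
label_index = {label: i for i, label in enumerate(label_list)}

targets = encode_multi_hot(token_labels, label_index)
recovered = decode_multi_label(targets, label_list)
```

In a trained model the multi-hot targets would typically be paired with per-label sigmoid outputs, and decoding would threshold those scores independently per label, which is what `decode_multi_label` mimics here on the gold targets.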

Paper Structure (21 sections, 2 figures, 11 tables)

Figures (2)

  • Figure 1: Macro-topics covered in EvN-ITA (in brackets, the number of frames belonging to each domain).
  • Figure 2: An example of full-text annotation in EvN-ITA (English translation: The construction of the Alvitian fortification dates back to the time of the Norman invasion.).