Survey on Semantic Interpretation of Tabular Data: Challenges and Directions

Marco Cremaschi, Blerina Spahiu, Matteo Palmonari, Ernesto Jimenez-Ruiz

TL;DR

This survey provides a comprehensive overview of the Semantic Table Interpretation (STI) landscape: it categorizes approaches using a taxonomy of 31 attributes to enable comparison and evaluation, and it analyzes in depth the Gold Standards used to evaluate STI approaches.

Abstract

Tabular data plays a pivotal role in various fields, making it a popular format for data manipulation and exchange, particularly on the web. The interpretation, extraction, and processing of tabular information are invaluable for knowledge-intensive applications. Notably, significant efforts have been invested in annotating tabular data with ontologies and entities from background knowledge graphs, a process known as Semantic Table Interpretation (STI). STI automation aids in building knowledge graphs, enriching data, and enhancing web-based question answering. This survey aims to provide a comprehensive overview of the STI landscape. It starts by categorizing approaches using a taxonomy of 31 attributes, allowing for comparisons and evaluations. It also examines available tools, assessing them based on 12 criteria. Furthermore, the survey offers an in-depth analysis of the Gold Standards used for evaluating STI approaches. Finally, it provides practical guidance to help end-users choose the most suitable approach for their specific tasks while also discussing unresolved issues and suggesting potential future research directions.

Paper Structure

This paper contains 50 sections, 10 figures, and 7 tables.

Figures (10)

  • Figure 1: Portion of the diary of Merer (around 2600 BC), an official in charge of a team of workers responsible for transporting limestone blocks from Tura to Giza to construct the Great Pyramid. The document details various aspects of the logistics involved in the transportation process, such as the organisation of labour, the use of boats to navigate the Nile River, and the daily activities of the workers.
  • Figure 2: Example of a well-formed relational table.
  • Figure 3: A sample of a knowledge graph (KG).
  • Figure 4: Example of an annotated table.
  • Figure 5: Examples of applications supported by STI.
  • ...and 5 more figures