Space of Data through the Lens of Multilevel Graph

Marco Caputo, Michele Russo, Emanuela Merelli

TL;DR

The paper tackles the complexity of dataspaces by introducing a multilevel graph that represents datasets across multiple abstraction layers and supports traceable contraction and expansion. It provides formal definitions (decontractible graphs, contraction, natural transformation) and the Multilevel Graph Data Analysis (MGDA) pipeline, which maps raw data through cleaning, normalization, and recursive feature detection into higher-level relational structures suitable for standard graph analytics. Preliminary validation on unstructured dream narratives indicates that contraction reduces noise while preserving traceability, yielding informative topological metrics such as assortativity, contraction percentage, and density. The work suggests MGDA as a promising framework for modeling dataspaces and enabling incremental, pay-as-you-go querying, with future directions including application to structured data and richer metadata integration.

Abstract

This work seeks to tackle the inherent complexity of dataspaces by introducing a novel data structure that can represent datasets across multiple levels of abstraction, ranging from local to global. We propose the concept of a multilevel graph, which is equipped with two fundamental operations: contraction and expansion of its topology. This multilevel graph is specifically designed to fulfil the requirements for incremental abstraction and flexibility, as outlined in existing definitions of dataspaces. Furthermore, we provide a comprehensive suite of methods for manipulating this graph structure, establishing a robust framework for data analysis. While its effectiveness has been empirically validated for unstructured data, its application to structured data is also inherently viable. Preliminary results are presented through a real-world scenario based on a collection of dream reports.
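
The two operations named in the abstract, contraction and expansion, can be illustrated with a small sketch. This is not the paper's formal construction (its definitions of decontractible graph and contraction function are not reproduced on this page); it is a minimal illustration assuming an undirected graph given as an edge list, and the names `contract`, `expand`, and `groups` are hypothetical. It shows how a group of vertices can collapse into one supernode while a stored mapping keeps the step traceable.

```python
def contract(edges, groups):
    """Contract an undirected graph (illustrative sketch, not the paper's definition).

    edges:  iterable of (u, v) vertex pairs.
    groups: dict mapping a supernode name to the set of vertices it absorbs.
    Returns (upper_edges, mapping), where mapping sends each absorbed
    vertex to its supernode so the contraction can be traced back.
    """
    mapping = {}
    for supernode, members in groups.items():
        for v in members:
            mapping[v] = supernode

    upper_edges = set()
    for u, v in edges:
        cu, cv = mapping.get(u, u), mapping.get(v, v)
        if cu != cv:
            # Edges internal to a supernode vanish at the upper level.
            upper_edges.add(frozenset((cu, cv)))
    return upper_edges, mapping


def expand(supernode, edges, mapping):
    """Recover the vertices and edges hidden inside one supernode."""
    members = {v for v, s in mapping.items() if s == supernode}
    inner = {frozenset((u, v)) for u, v in edges if u in members and v in members}
    return members, inner


# Lower level: a triangle a-b-c with a pendant vertex d.
edges = [("a", "b"), ("b", "c"), ("c", "a"), ("c", "d")]

# Contract the triangle into a single supernode "A".
upper_edges, mapping = contract(edges, {"A": {"a", "b", "c"}})

# Expanding "A" returns the triangle it replaced.
members, inner = expand("A", edges, mapping)
```

In this toy case the upper level keeps a single edge between `A` and `d`, and `expand` returns the three original vertices and three edges. Repeating `contract` on its own output would stack further levels, which is the intuition behind a multilevel graph of height greater than one.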

Paper Structure

This paper contains 17 sections, 3 equations, 11 figures, 1 table.

Figures (11)

  • Figure 1: A local decontraction in a decontractible graph.
  • Figure 2: Representation of a multilevel graph of height 2.
  • Figure 3: Phases of multilevel graph data analysis on a dream report.
  • Figure 4: Examples of spaces of data arising from simple cycle detection on graphs.
  • Figure 5: Average assortativity coefficients among levels for the $P$ group of dreamers.
  • ...and 6 more figures

Theorems & Definitions (5)

  • Definition: Decontractible Graph
  • Definition: Contraction of a Decontractible Graph
  • Definition: Contraction Function
  • Definition: Natural Transformation of a Graph
  • Definition: Multilevel Graph