The Ontoverse: Democratising Access to Knowledge Graph-based Data Through a Cartographic Interface

Johannes Zimmermann, Dariusz Wiktorek, Thomas Meusburger, Miquel Monge-Dalmau, Antonio Fabregat, Alexander Jarasch, Günter Schmidt, Jorge S. Reis-Filho, T. Ian Simpson

TL;DR

A unique approach to data navigation that draws on geographical visualisation and hierarchically structured domain knowledge, enabling end-users to explore knowledge spaces grounded in their domains of interest and to interact graphically with the underlying knowledge graph.

Abstract

As the number of scientific publications and preprints grows exponentially, several attempts have been made to navigate this complex and increasingly detailed landscape. These have almost exclusively taken unsupervised approaches that fail to incorporate domain knowledge and lack the structural organisation required for intuitive, interactive human exploration and discovery. Especially in highly interdisciplinary fields, a deep understanding of the connectedness of research works across topics is essential for generating insights. We have developed a unique approach to data navigation that draws on geographical visualisation and uses hierarchically structured domain knowledge to enable end-users to explore knowledge spaces grounded in their desired domains of interest. This domain knowledge can be taken from existing ontologies or proprietary intelligence schemata, or derived directly from the underlying data through hierarchical topic modelling. Our approach uses natural language processing techniques to extract named entities from the underlying data and normalise them against relevant domain references and navigational structures. The knowledge is integrated by first calculating similarities between entities based on their shared extracted feature space and then aligning the entities to the navigational structures. The result is a knowledge graph that supports full-text and semantic graph queries as well as structured, topic-driven navigation. End-users can thereby identify entities relevant to their needs and access extensive graph analytics. The user interface facilitates graphical interaction with the underlying knowledge graph and mimics a cartographic map to maximise ease of use and widen adoption. We demonstrate an exemplar project using our generalisable and scalable infrastructure for an academic biomedical literature corpus that is grounded against hundreds of different named domain entities.
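
The similarity step described in the abstract can be illustrated with a minimal sketch. The abstract does not name the similarity measure, so Jaccard overlap of extracted entity sets is used here purely as an assumption; the function names, publication identifiers, and annotations below are all hypothetical.

```python
from itertools import combinations


def jaccard(a: set, b: set) -> float:
    """Jaccard similarity of two feature sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def build_similarity_edges(features: dict, threshold: float = 0.0) -> dict:
    """Return {(entity_a, entity_b): score} for pairs scoring above threshold."""
    edges = {}
    for a, b in combinations(sorted(features), 2):
        score = jaccard(features[a], features[b])
        if score > threshold:
            edges[(a, b)] = score
    return edges


# Hypothetical publications with hypothetical normalised entity annotations.
features = {
    "pub1": {"insulin", "diabetes", "pancreas"},
    "pub2": {"insulin", "diabetes", "obesity"},
    "pub3": {"glioma"},
}
edges = build_similarity_edges(features)
print(edges)
```

Under these assumptions only publications that share at least one extracted entity receive an edge, and the edge weight grows with the proportion of shared entities.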

Paper Structure (17 sections, 7 figures)

Figures (7)

  • Figure 1: Ontoverse: A cartographic user interface nurtured by an ensemble of three Knowledge Graphs, allowing for the seamless navigation of a knowledge landscape.
  • Figure 2: Overview of the core Ontoverse building blocks. The Core Entity Graph (CEG) comprises entity nodes (green circles), which are publications in the use case discussed here, and edges (green lines) representing inter-entity similarity (A). The CEG is anchored by alignment with a topic hierarchy graph (THG), which can be generated manually, generated dynamically by topic modelling in the graph domain space, or derived from pre-existing domain ontologies (B). The Entity Annotation Table (EAT) defines links between topics and entities; it can be generated by manual curation or by performing named-entity recognition (NER) on entity metadata (C). Entities that are associated with more than one topic are represented by entity clones (blue nodes) that are dynamically created based on their pathway relationships within the THG. The entity clone concept allows topics to be spatially separated whilst maintaining the topology of the graph. We induce topic association (yellow topic tags) to enable logically coherent and seamless navigation guided by the THG (D).
  • Figure 3: Graph Connectivity & Topographic Navigation. Parental entities (green circles) are connected if they are annotated to the same topic (solid green lines). Entity clones (blue circles) are then generated based on their mappings to topics, either by direct annotation to the topic (red flags) or by induction (gold flags), where induced topics are those falling on the transitive path between the parent entity topic and a sub-topic in the THG (A). Matching edges are created to connect all parental entities with their clones (red lines) (B). We next generate all necessary edges between entity clones to satisfy the parental mappings, for example if E1$\leftrightarrow$E4 then {E1c1$\leftrightarrow$E4, E1c2$\leftrightarrow$E4, ...}. Edges are expanded to incorporate the connectivity of entity clones with "within-topic" (solid green lines) or "between-topic" (dashed green lines) edges to allow exploration of their proximal and distal topic environment (C). Graph topology is then re-organised using a nested radial algorithm to arrange topics and their sub-topics proximal to each other (D). We visualise the graph as a cartographic map with contour lines and colouration reflecting entity density: densely packed locations appear as hills, sparse areas as valleys, and the "sea" separates areas that are poorly connected (E). The map can be viewed at different levels of granularity by selecting the THG depth (F). All edges are weighted based on the calculated similarity score between the corresponding parental entities.
  • Figure 4: Sequence Diagram of the Ontoverse flow within the layers of a technical stack.
  • Figure 5: Spatial layout of entities in the graph as dictated by circular packing.
  • ...and 2 more figures
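
The entity clone construction described in the Figure 3 caption can be sketched as follows. This is one possible reading of the caption and not the authors' implementation: it assumes the THG is a tree stored as a child-to-parent mapping, that a topic is "induced" when it lies strictly between two topics the entity is directly annotated to, and that every parent-level edge is expanded to all pairs of clones. The names `path_to_root`, `clone_topics`, `expand_edges`, and the example topics and entities are hypothetical.

```python
def path_to_root(topic, parent):
    """Topics from `topic` up to the root of the hierarchy, inclusive."""
    path = [topic]
    while parent.get(topic) is not None:
        topic = parent[topic]
        path.append(topic)
    return path


def clone_topics(annotated, parent):
    """Map each topic an entity is cloned under to 'direct' or 'induced'."""
    tags = {t: "direct" for t in annotated}
    for t in annotated:
        path = path_to_root(t, parent)
        # Find the nearest annotated ancestor; topics in between are induced.
        for i, ancestor in enumerate(path[1:], start=1):
            if ancestor in annotated:
                for between in path[1:i]:
                    tags.setdefault(between, "induced")
                break
    return tags


def expand_edges(parent_edges, clones):
    """Expand parent-level edges to clone-level edges labelled by topic scope."""
    out = []
    for a, b in parent_edges:
        for ta in sorted(clones[a]):
            for tb in sorted(clones[b]):
                kind = "within-topic" if ta == tb else "between-topic"
                out.append(((a, ta), (b, tb), kind))
    return out


# Hypothetical topic hierarchy (child -> parent) and annotations.
parent = {"biology": None, "genetics": "biology",
          "epigenetics": "genetics", "oncology": None}
clones = {
    "E1": clone_topics({"biology", "epigenetics"}, parent),
    "E4": clone_topics({"genetics"}, parent),
}
edges = expand_edges([("E1", "E4")], clones)
for edge in edges:
    print(edge)
```

In this toy hierarchy E1 receives an induced clone under "genetics", which is what gives it a "within-topic" edge to E4; the other two clone edges are "between-topic". Edge weights are omitted here; per the caption they would be inherited from the similarity score of the parental entities.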