NACSOS-nexus: NLP Assisted Classification, Synthesis and Online Screening with New and EXtended Usage Scenarios

Tim Repke, Max Callaghan

TL;DR

The paper presents NACSOS-nexus, an open-source rearchitecture of the NACSOS platform designed to support scalable, living evidence syntheses. It integrates multi-source data ingestion (including an OpenAlex snapshot), robust deduplication, ML-assisted prioritised screening with statistical stopping criteria, and a flexible, assignment-based annotation workflow. A dedicated query language (NQL), a literature hub, and reusable analytic pipelines enable transparent, reproducible maps and living reviews across research domains. The approach emphasizes data provenance, user-centric annotation management, and an extensible ecosystem for classifier tools and living maps, with broad potential impact on systematic maps and reviews.

Abstract

NACSOS is a web-based platform for curating data used in systematic maps. It contains several (experimental) features that support the evidence synthesis process: finding and ingesting primary data (mainly scientific publications), basic search and exploration of that data, and, above all, managing manual and automated annotations. The platform supports prioritised screening algorithms and is the first to fully implement statistical stopping criteria. Annotations by multiple coders can be resolved, and customisable quality metrics are computed on-the-fly. Currently, annotations are performed at the document level. The ecosystem around NACSOS offers packages for accessing the underlying database and practical utility functions that have proven useful in a multitude of projects. Further, it provides the backbone of living maps, review ecosystems, and our public literature hub for sharing high-quality curated corpora.
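
To make the idea of a statistical stopping criterion concrete, the following is a minimal sketch of one possible formulation, based on a hypergeometric test. The abstract does not specify which test the platform uses, so the test itself, the function names, and all parameters here are illustrative assumptions and not the NACSOS API. The sketch assumes that, after some relevant documents have already been found, a random sample is drawn from the remaining unscreened documents; it then asks how likely it would be to see so few relevant documents in that sample if the recall target had been missed.

```python
from math import comb, floor


def hypergeom_cdf(k, population, successes, draws):
    """P(X <= k) for X ~ Hypergeometric(population, successes, draws)."""
    total = comb(population, draws)
    hits = sum(
        comb(successes, i) * comb(population - successes, draws - i)
        for i in range(k + 1)
    )
    return hits / total


def stopping_p_value(n_remaining, n_sampled, k_sample, r_before, target_recall):
    """p-value for the null hypothesis "recall is below target_recall".

    n_remaining: unscreened documents when random sampling started
    n_sampled:   documents drawn at random from those (without replacement)
    k_sample:    relevant documents found in the sample
    r_before:    relevant documents found before sampling started
    All names are hypothetical and chosen for this illustration only.
    """
    # Smallest number of relevant documents among n_remaining that would
    # still leave recall below the target.
    k_target = floor((r_before + k_sample) / target_recall - r_before) + 1
    if k_target > n_remaining:
        # The null hypothesis would need more relevant documents than exist.
        return 0.0
    return hypergeom_cdf(k_sample, n_remaining, k_target, n_sampled)


# Usage: 10 relevant documents found so far, 100 documents left,
# and a target recall of 50% (chosen only to keep the arithmetic simple).
p_small_sample = stopping_p_value(100, 10, 0, 10, 0.5)
p_large_sample = stopping_p_value(100, 50, 0, 10, 0.5)
print(p_small_sample, p_large_sample)
```

Under these assumptions, screening would stop once the p-value falls below a chosen significance level. A larger sample without any relevant documents yields a smaller p-value, which is why the second call gives stronger evidence for stopping than the first.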

Paper Structure (15 sections, 1 equation, 2 figures)

Figures (2)

  • Figure 1: Item annotation view; document is shown in the middle, the annotation scheme on the right. This example shows a nested scheme (some labels only show up when another condition is met) with several single- and multi-choice labels.
  • Figure 2: Label resolution view. Rows contain annotations for this item, columns are (nested) labels. Rectangle boxes are each annotator's labels, rounded boxes are resolved labels. Hovering any field reveals additional details (such as human-readable label or value, name of annotators, or document information).
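
The label resolution described for Figure 2 can be illustrated with a small sketch. The source does not state which resolution rule the platform applies, so a simple majority vote is assumed here, and the data layout and the name `resolve_labels` are hypothetical. Labels on which annotators tie are left unresolved and reported for manual review.

```python
from collections import Counter


def resolve_labels(annotations):
    """Resolve per-label values across annotators by majority vote.

    annotations: dict mapping annotator name -> dict of label key -> value.
    Returns (resolved, conflicts): resolved maps label key -> winning value,
    conflicts lists label keys where the top values are tied.
    This is an assumed rule for illustration, not the platform's own logic.
    """
    values_by_label = {}
    for labels in annotations.values():
        for key, value in labels.items():
            values_by_label.setdefault(key, []).append(value)

    resolved, conflicts = {}, []
    for key, values in values_by_label.items():
        ranked = Counter(values).most_common()
        top_value, top_count = ranked[0]
        if len(ranked) > 1 and ranked[1][1] == top_count:
            conflicts.append(key)  # tie: needs a human decision
        else:
            resolved[key] = top_value
    return resolved, conflicts


# Usage: three annotators; "topic" is a nested label that only two of them saw.
item_annotations = {
    "annotator_a": {"relevant": 1, "topic": "x"},
    "annotator_b": {"relevant": 1, "topic": "y"},
    "annotator_c": {"relevant": 0},
}
resolved, conflicts = resolve_labels(item_annotations)
print(resolved, conflicts)
```

In this example the "relevant" label resolves to the majority value, while "topic" is flagged as a conflict because the two annotators who set it disagree.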