Pennsieve: A Collaborative Platform for Translational Neuroscience and Beyond

Zack Goldblum, Zhongchuan Xu, Haoer Shi, Patryk Orzechowski, Jamaal Spence, Kathryn A Davis, Brian Litt, Nishant Sinha, Joost Wagenaar

TL;DR

Pennsieve is an open-source, cloud-based scientific data management platform built to handle the exponential growth of neuroscientific data. It adheres to the findable, accessible, interoperable, and reusable (FAIR) principles of data sharing.

Abstract

The exponential growth of neuroscientific data necessitates platforms that facilitate data management and multidisciplinary collaboration. In this paper, we introduce Pennsieve, an open-source, cloud-based scientific data management platform built to meet these needs. Pennsieve supports complex multimodal datasets and provides tools for data visualization and analysis. It takes a comprehensive approach to data integration, enabling researchers to define custom metadata schemas and use advanced tools to filter and query their data. Pennsieve's modular architecture allows external applications to extend its capabilities, and collaborative workspaces with peer-reviewed data publishing mechanisms promote high-quality datasets optimized for downstream analysis, both in the cloud and on-premises. Pennsieve forms the core of major neuroscience research programs, including the NIH SPARC Initiative, the NIH HEAL Initiative's PRECISION Human Pain Network, and the NIH HEAL RE-JOIN Initiative. It serves more than 80 research groups worldwide, along with several large-scale, inter-institutional projects at clinical sites through the University of Pennsylvania. Underpinning the SPARC.Science, Epilepsy.Science, and Pennsieve Discover portals, Pennsieve stores over 125 TB of scientific data, with 35 TB of data publicly available across more than 350 high-impact datasets. It adheres to the findable, accessible, interoperable, and reusable (FAIR) principles of data sharing and is recognized as one of the NIH-approved Data Repositories. By facilitating scientific data management, discovery, and analysis, Pennsieve fosters a robust and collaborative research ecosystem for neuroscience and beyond.

Paper Structure (49 sections, 12 figures, 7 tables)

This paper contains 49 sections, 12 figures, 7 tables.

Figures (12)

  • Figure 1: The Pennsieve platform serves as a data management infrastructure for the global scientific community. It is built around shared workspaces, data repositories, scalable analytics, and integrations that facilitate collaborative research at scale. FAIR: Findable, Accessible, Interoperable, and Reusable.
  • Figure 2: A technical overview of the Pennsieve platform. Its multi-tenant architecture supports multiple consortia through dedicated web applications and well-documented APIs.
  • Figure 3: The Pennsieve platform's web application interface. This workspace view highlights key features including dataset management, collaborative functionality, publishing workflows, and analysis tools.
  • Figure 4: Pennsieve platform metrics and file type distribution. The circular chart illustrates the variety of file types supported, with comparative proportions for all datasets and public datasets.
  • Figure A.1: Pennsieve is utilized by more than 1,700 users across 80+ research sites worldwide.
  • ...and 7 more figures