A modular and scalable web platform for computational phylogenetics

Nyckollas Brandão, André Jesus, André Páscoa, Alexandre P. Francisco, Mário Ramirez, Cátia Vaz

TL;DR

The paper addresses the challenge of scaling phylogenetic analyses and integrating epidemiological data by presenting a modular cloud-based platform. It introduces the PHYLOViZ Web Platform, a cloud-native, workflow-driven system that unifies prior PHYLOViZ features and supports structured, reusable analyses via containerized tools and a data-centric workflow model. The architecture combines a React frontend, Spring Boot microservices, a REST API gateway, and data stores such as PhyloDB with metadata in MongoDB, enabling multi-user, multi-dataset project management and scalable visualizations. The platform emphasizes reproducibility by persisting workflow specifications, tool configurations, and results, and points to future work on advanced tree visualization and additional analysis methods.

Abstract

Phylogenetic analysis, which allows us to understand the evolution of bacterial and viral epidemics, requires large quantities of data to be analysed and processed for knowledge extraction. One of the major challenges is integrating the results of typing and phylogenetic inference methods with epidemiological data, in particular their integrated and simultaneous analysis and visualization. Numerous approaches to support phylogenetic analysis have been proposed, ranging from standalone tools to integrative web applications that include tools and/or algorithms for executing the common analysis tasks for this kind of data. However, most of them lack the capacity to integrate epidemiological data. Others support visualizing and analysing such data together with epidemiological data, but they do not scale to large data analysis and visualization. In particular, most of them run inference and/or visualization optimization tasks on the client side, which often becomes infeasible for large amounts of data and usually implies transferring data from existing databases in order to analyse it. Moreover, the results and optimizations are not stored for reuse. We propose the PHYLOViZ Web Platform, a cloud-based tool for phylogenetic analysis that not only unifies the features of both existing versions of PHYLOViZ, but also supports structured and customized workflows for executing data processing and analysis tasks, and promotes the reproducibility of previous phylogenetic analyses. This platform supports large-scale analyses by relying on a workflow system that enables the distribution of parallel computations on cloud and HPC environments. Moreover, it has a modular architecture, allowing easy integration of new methods and tools, as well as customized workflows, making it flexible and extensible.

Paper Structure

This paper contains 5 sections, 5 figures, and 1 table.

Figures (5)

  • Figure 1: PHYLOViZ Web Platform architecture.
  • Figure 2: Compute microservice architecture.
  • Figure 3: PHYLOViZ Web Platform use cases.
  • Figure 4: UI flow diagram.
  • Figure 5: Tree visualization details.