consexpressionR: an R package for consensus differential gene expression analysis

Juliana Costa-Silva, David Menotti, Fabricio M. Lopes

TL;DR

This work tackles the variability in differential expression results across RNA-Seq analysis methods by proposing a consensus-based framework. It introduces consexpressionR, an R package that automates DE analysis by combining seven established methods through a five-stage workflow, with visualization and reporting, validated on two qPCR-validated datasets. The results show that a consensus of four to five methods yields comparable or better accuracy with higher precision than individual methods, while larger consensuses trade recall for precision. The approach enhances the reliability and reproducibility of DEG lists in bulk RNA-Seq and points to future extensions to scRNA-Seq and Bioconductor integration.

Abstract

Motivation: Bulk RNA-Seq is a widely used method for studying gene expression across a variety of contexts. The significance of RNA-Seq studies has grown with the advent of high-throughput sequencing technologies. Computational methods have been developed for each stage of the identification of differentially expressed genes. Nevertheless, few studies have explored how different methods perform when combined. In this study, we evaluated the impact of combining methodologies on the results of differential expression analysis. Using two data sets with qPCR data as the gold-standard reference, seven methods implemented in R packages (EBSeq, edgeR, DESeq2, limma, SAMseq, NOISeq, and KnowSeq) were run and assessed both separately and in combination. The results were evaluated against the qPCR data. Results: Here, we introduce consexpressionR, an R package that automates differential expression analysis using a consensus across these seven methodologies, producing more reliable results with a significant reduction in false positives. Availability: consexpressionR is an R package; source code and support are available on GitHub (https://github.com/costasilvati/consexpressionR).
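The consensus idea described above can be illustrated with a minimal sketch. It is written in Python for illustration only and does not use or reproduce the consexpressionR API: the function names `consensus_degs` and `precision_recall`, the gene identifiers, and the toy call sets are all hypothetical. The sketch assumes that a gene enters the consensus list when at least `min_votes` methods flag it as differentially expressed, and that the list is then scored against a reference set standing in for the qPCR data.

```python
from collections import Counter


def consensus_degs(calls, min_votes):
    """Return the genes flagged as DE by at least `min_votes` methods.

    `calls` maps a method name to the set of genes that method called DE.
    """
    votes = Counter(gene for genes in calls.values() for gene in set(genes))
    return {gene for gene, n in votes.items() if n >= min_votes}


def precision_recall(predicted, reference):
    """Score a predicted DEG set against a reference (e.g. qPCR-confirmed) set."""
    true_positives = len(predicted & reference)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(reference) if reference else 0.0
    return precision, recall


# Toy, made-up calls from three methods (hypothetical gene IDs).
calls = {
    "edgeR": {"g1", "g2", "g3"},
    "DESeq2": {"g1", "g2", "g4"},
    "limma": {"g1", "g3", "g5"},
}
reference = {"g1", "g2"}  # stand-in for a qPCR-validated gene set

for min_votes in (1, 2, 3):
    selected = consensus_degs(calls, min_votes)
    precision, recall = precision_recall(selected, reference)
    print(min_votes, sorted(selected), round(precision, 2), round(recall, 2))
```

On this toy input, raising the vote threshold from 1 to 3 moves precision from 0.4 to 1.0 while recall drops from 1.0 to 0.5, which mirrors the precision and recall trade-off the TL;DR attributes to larger consensuses.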

Paper Structure

This paper contains 6 sections, 3 figures, 3 tables.

Figures (3)

  • Figure 1: consexpressionR analysis workflow and main functionalities. The workflow comprises four steps, of which only visualization is optional.
  • Figure 2: Precision-recall curve for dataset (A), considering the consensus of methods.
  • Figure 3: Precision-recall curve for dataset (B), considering the consensus of methods.