Differential gene expression analysis via two-component mixture models with a semiparametric skew-normal scale mixture alternative

Sangkon Oh, Geoffrey J. McLachlan

TL;DR

A semiparametric mixture model is proposed in which the null component is standard normal and the alternative follows a skew-normal scale mixture with an unspecified scale mixing distribution, providing a flexible and computationally tractable tool for differential gene-expression analysis without restrictive distributional assumptions.

Abstract

Two-component mixture models are particularly useful for identifying differentially expressed genes, but their performance can deteriorate markedly when the alternative distribution departs from parametric assumptions or symmetry. We propose a semiparametric mixture model in which the null component is standard normal and the alternative follows a skew-normal scale mixture with an unspecified scale mixing distribution. This formulation accommodates skewness and heavy tails, providing a flexible and computationally tractable tool for differential gene-expression analysis without restrictive distributional assumptions. We establish identifiability and consistency of the model and develop an efficient estimation algorithm that incorporates nonparametric maximum likelihood estimation of the scale distribution. Numerical studies show notable improvements over existing parametric and nonparametric approaches for modeling the alternative distribution, and applications to colon cancer and leukemia datasets demonstrate reduced false discovery and false negative rates.
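
To make the model structure concrete, here is a minimal sketch of the two-component density described above. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not give the parameterization, so the location `loc`, the skewness parameter `shape`, and the representation of the unspecified scale mixing distribution as a discrete set of support points `scales` with probabilities `weights` are all hypothetical choices (a discrete distribution is the usual form taken by a nonparametric maximum likelihood estimate). The function `posterior_null_prob` is likewise an assumed, standard use of a two-component mixture rather than something stated in the abstract.

```python
import math


def norm_pdf(x):
    """Standard normal density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def norm_cdf(x):
    """Standard normal distribution function, via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def skew_normal_pdf(z, loc, scale, shape):
    """Skew-normal density with location, scale, and skewness (shape)."""
    u = (z - loc) / scale
    return 2.0 / scale * norm_pdf(u) * norm_cdf(shape * u)


def alternative_pdf(z, loc, shape, scales, weights):
    """Skew-normal scale mixture.

    Assumption: the scale mixing distribution is discrete, with support
    points `scales` and probabilities `weights` that sum to one.
    """
    return sum(w * skew_normal_pdf(z, loc, s, shape)
               for s, w in zip(scales, weights))


def mixture_pdf(z, pi0, loc, shape, scales, weights):
    """Two-component density: standard normal null plus the alternative."""
    return (pi0 * norm_pdf(z)
            + (1.0 - pi0) * alternative_pdf(z, loc, shape, scales, weights))


def posterior_null_prob(z, pi0, loc, shape, scales, weights):
    """Posterior probability that a score z comes from the null component."""
    return pi0 * norm_pdf(z) / mixture_pdf(z, pi0, loc, shape, scales, weights)


# Hypothetical parameter values, chosen only for illustration.
params = dict(pi0=0.8, loc=1.0, shape=2.0,
              scales=(1.0, 3.0), weights=(0.5, 0.5))

for z in (0.0, 2.0, 5.0):
    print(z, mixture_pdf(z, **params), posterior_null_prob(z, **params))
```

Under these assumed values, scores far in the right tail receive a small posterior null probability, because the larger scale support point gives the alternative heavier tails than the standard normal null. In practice the parameters and the mixing distribution would be estimated from the data rather than fixed by hand.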

Paper Structure

This paper contains 16 sections, 2 theorems, 50 equations, 10 figures, and 2 tables.

Key Result

Theorem 1

Suppose for almost all $z \in \mathbb{R}$. Then, it follows that

Figures (10)

  • Figure 1: Estimated density from each method with one simulated sample of Case $I$ when $N = 5000$
  • Figure 2: Estimated density from each method with one simulated sample of Case $II$ when $N = 5000$
  • Figure 3: Estimated density from each method with one simulated sample of Case $III$ when $N = 5000$
  • Figure 4: Estimated density from each method with one simulated sample of Case $IV$ when $N = 5000$
  • Figure 5: Estimated density from each method with one simulated sample of Case $V$ when $N = 5000$
  • ...and 5 more figures

Theorems & Definitions (6)

  • Theorem 1
  • proof
  • Remark 1
  • Remark 2
  • Theorem 2
  • proof