
Benchmarking 80 binary phenotypes from the openSNP dataset using deep learning algorithms and polygenic risk score tools

Muhammad Muneeb, David B. Ascher, YooChan Myung, Samuel F. Feng, Andreas Henschel

TL;DR

This manuscript benchmarks the performance of various machine/deep learning algorithms and polygenic risk score tools on 80 binary phenotypes extracted from the openSNP dataset, to show which techniques tend to perform better for which phenotypes, with machine/deep learning compared against more traditional polygenic risk score tools.

Abstract

Genotype-phenotype prediction plays a crucial role in identifying disease-causing single nucleotide polymorphisms (SNPs) and in precision medicine. In this manuscript, we benchmark the performance of various machine/deep learning algorithms and polygenic risk score (PRS) tools on 80 binary phenotypes extracted from the openSNP dataset. After cleaning and extraction, the genotype data for each phenotype are passed to PLINK for quality control and then transformed separately for each of the considered tools and algorithms. To compute polygenic risk scores, we applied quality control to both the test data and the genome-wide association study (GWAS) summary statistics file, and evaluated various combinations of clumping and pruning. For the machine learning algorithms, we used p-value thresholding on the training data to select SNPs and passed the resulting data to the algorithm. We report the Area Under the Curve (AUC), averaged over 5 folds, for 29 machine learning algorithms, 80 deep learning algorithms, and 3 PRS tools with 675 different clumping and pruning parameter settings. Machine learning performed best for 44 phenotypes, while PRS tools performed best for 36 phenotypes. The results show which techniques tend to perform better for which phenotypes, relative to more traditional PRS tools.
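The SNP-selection and evaluation steps described in the abstract can be sketched in a few lines. This is only a minimal illustration, not the authors' pipeline: the function names (`snp_p_value`, `cross_validated_auc`, `make_synthetic`), the z-test used in place of a real GWAS association test, the difference-of-centroids scorer used in place of the ML/DL models, and the synthetic genotypes are all assumptions made for the example. The point it shows is that p-value thresholding is done on the training fold only, and that the reported metric is the mean AUC over 5 folds.

```python
import math
import random


def auc(scores, labels):
    """AUC via the Mann-Whitney identity: P(case score > control score)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def snp_p_value(dosages, labels):
    """Two-sided z-test on mean allele dosage, cases vs. controls.

    Assumption: a simple stand-in for the association test a GWAS tool runs.
    """
    groups = {0: [], 1: []}
    for d, y in zip(dosages, labels):
        groups[y].append(d)
    stats = []
    for g in (groups[1], groups[0]):
        mean = sum(g) / len(g)
        var = sum((d - mean) ** 2 for d in g) / (len(g) - 1)
        stats.append((mean, var, len(g)))
    (m1, v1, n1), (m0, v0, n0) = stats
    se = math.sqrt(v1 / n1 + v0 / n0)
    if se == 0:
        # No within-group variation: either no difference or perfect separation.
        return 1.0 if m1 == m0 else 0.0
    return math.erfc(abs(m1 - m0) / se / math.sqrt(2))


def cross_validated_auc(X, y, p_threshold=0.05, k=5, seed=0):
    """Mean test AUC over k folds, selecting SNPs on each training fold only."""
    idx = list(range(len(y)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    fold_aucs = []
    for test_idx in folds:
        held_out = set(test_idx)
        train_idx = [i for i in idx if i not in held_out]
        y_train = [y[i] for i in train_idx]
        # p-value thresholding: keep SNPs associated with the phenotype
        # in the training fold; the test fold is never consulted here.
        selected = [
            j for j in range(len(X[0]))
            if snp_p_value([X[i][j] for i in train_idx], y_train) < p_threshold
        ]
        # Hypothetical stand-in model: project onto (case - control) centroids.
        weights = []
        for j in selected:
            case = [X[i][j] for i, lab in zip(train_idx, y_train) if lab == 1]
            ctrl = [X[i][j] for i, lab in zip(train_idx, y_train) if lab == 0]
            weights.append(sum(case) / len(case) - sum(ctrl) / len(ctrl))
        scores = [sum(w * X[i][j] for w, j in zip(weights, selected))
                  for i in test_idx]
        fold_aucs.append(auc(scores, [y[i] for i in test_idx]))
    return sum(fold_aucs) / k


def make_synthetic(n=200, n_snps=20, n_causal=3, seed=1):
    """Synthetic 0/1/2 dosages; the first n_causal SNPs differ by case status."""
    rng = random.Random(seed)
    y = [i % 2 for i in range(n)]
    X = []
    for label in y:
        row = []
        for j in range(n_snps):
            freq = (0.7 if label else 0.3) if j < n_causal else 0.5
            row.append(sum(rng.random() < freq for _ in range(2)))
        X.append(row)
    return X, y


if __name__ == "__main__":
    X, y = make_synthetic()
    print("mean 5-fold AUC:", round(cross_validated_auc(X, y), 3))
```

Selecting SNPs inside the fold loop, rather than once on the full dataset, keeps the test samples from influencing feature selection. The folds here are not stratified by case/control status; with small or unbalanced phenotypes a stratified split would be the safer choice.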

Paper Structure (19 sections, 4 equations, 5 figures, 6 tables)

Figures (5)

  • Figure 1: Workflow of genotype-phenotype prediction using ML/DL and PRS, shown as a case/control classification flowchart. First, phenotype data are cleaned and binary phenotypes are extracted from the openSNP dataset. Second, the genotype data for each phenotype are merged, converted to PLINK format, and subjected to quality control; the data are then split into train/test folds, and all further processing is done on each fold. Third, multiple sub-datasets are generated by applying p-value thresholds to the GWAS file and are passed to the ML/DL models. Fourth, the GWAS is generated from the training data, quality control is performed on both the test and GWAS files, and the processed files are passed to the PRS tools.
  • Figure 2: AUC for each phenotype obtained from the ML/DL algorithms, with phenotypes grouped by the number of SNPs that yielded the best results.
  • Figure 3: Heatmap of the test-data AUC, as a percentage, yielded by the ML/DL and PRS tools for each phenotype.
  • Figure 4: Best AUC for each phenotype, with phenotypes grouped by the tool that yielded the best AUC.
  • Figure 5: The x-axis shows the phenotypes. The y-axis labels show the tool and hyperparameter combination that yielded the best AUC for a given phenotype; the prefixes D, L, M, P, and PR stand for deep learning, Lassosum, machine learning, PLINK, and PRSice, respectively. Each label also shows the number of phenotypes for which that combination generated the best AUC. Each cell reports the AUC in percent for a phenotype; a value of 0 means that the tool and hyperparameter combination did not yield the best AUC for that phenotype.
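The PRS branch of the Figure 1 workflow (GWAS summary statistics plus test genotypes, with clumping) can be illustrated with a simplified sketch. It is not how PLINK, PRSice, or Lassosum are implemented: the function names, the SNP identifiers (`snp_a`, `snp_b`, `snp_c`), and all numbers are hypothetical, and the clumping here ignores the physical-distance window that real tools apply. It shows only the basic idea: keep the most significant SNP in each correlated group, then score a person as the effect-size-weighted sum of their allele dosages.

```python
def r_squared(a, b):
    """Squared Pearson correlation between two dosage vectors (an LD proxy)."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0  # a monomorphic SNP is uncorrelated with everything
    return cov * cov / (var_a * var_b)


def clump(genotypes, p_values, r2_threshold=0.1, p_threshold=1.0):
    """Greedy clumping: visit SNPs from most to least significant and keep
    one only if it is not in high LD with any SNP already kept.

    Simplification: no physical-distance window is applied.
    """
    candidates = [s for s in p_values if p_values[s] <= p_threshold]
    kept = []
    for snp in sorted(candidates, key=p_values.get):
        if all(r_squared(genotypes[snp], genotypes[k]) <= r2_threshold
               for k in kept):
            kept.append(snp)
    return kept


def polygenic_score(person_dosages, betas, snps):
    """Additive score: sum of effect size times allele dosage over kept SNPs."""
    return sum(betas[s] * person_dosages[s] for s in snps)


if __name__ == "__main__":
    # Hypothetical reference genotypes: one list of dosages per SNP.
    genotypes = {
        "snp_a": [0, 1, 2, 1, 0, 2],
        "snp_b": [0, 1, 2, 1, 0, 2],  # perfectly correlated with snp_a
        "snp_c": [1, 0, 1, 2, 1, 1],  # uncorrelated with snp_a
    }
    # Hypothetical summary statistics.
    p_values = {"snp_a": 0.001, "snp_b": 0.01, "snp_c": 0.03}
    betas = {"snp_a": 0.5, "snp_b": 0.4, "snp_c": -0.2}

    kept = clump(genotypes, p_values)
    person = {"snp_a": 2, "snp_b": 2, "snp_c": 1}
    print(kept, polygenic_score(person, betas, kept))
```

Varying `r2_threshold` and `p_threshold` in a sketch like this is a small-scale analogue of the clumping and pruning parameter grid the abstract refers to.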