ResGene-T: A Tensor-Based Residual Network Approach for Genomic Prediction

Kuldeep Pathak, Kapil Ahuja, Eric de Sturler

TL;DR

This work proposes ResGene-T, a new deep learning model for Genomic Prediction (GP), the task of correlating genotypic data with phenotypic traits. Its novel idea is to convert the 2D-image representation of a genotype into a 3D tensor and feed that tensor to the ResNet-18 architecture.

Abstract

In this work, we propose a new deep learning model for Genomic Prediction (GP), which involves correlating genotypic data with phenotypic traits. The genotypes are typically fed as a sequence of characters to the 1D-Convolutional Neural Network layer of the underlying deep learning model. Inspired by earlier work that represented the genotype as a 2D-image for genotype-phenotype classification, we extend this idea to GP, which is a regression task. We use ResNet-18 as the underlying architecture and term this model ResGene-2D. Although the 2D-image representation captures biological interactions well, it needs all the layers of the model to do so, which limits training efficiency. As a result, and consistent with the earlier work that proposed the 2D-image representation, ResGene-2D performs almost the same as other models (3% improvement). To overcome this, we propose a novel idea: convert the 2D-image into a 3D tensor and feed it to the ResNet-18 architecture. We term this model ResGene-T. We evaluate our proposed models on three crop species with ten phenotypic traits and compare them with seven of the most popular models (two statistical, two machine learning, and three deep learning). ResGene-T outperforms all seven of these methods (gains from 14.51% to 41.51%).
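The abstract describes a pipeline from a character sequence to a 2D-image and then to a tensor, but it does not give the exact encoding or layout. The following is a minimal sketch of one way such a transformation could look, not the authors' implementation. The character-to-integer map `ENCODING`, the zero padding, the square image shape, the row-band split, and the function names `genotype_to_image` and `image_to_tensor` are all assumptions made for illustration.

```python
import math

import numpy as np

# Hypothetical character encoding; the paper's actual scheme is not stated here.
ENCODING = {"A": 1, "C": 2, "G": 3, "T": 4}


def genotype_to_image(genotype: str) -> np.ndarray:
    """Encode a genotype string and fold it row by row into a square 2D image.

    Unknown characters map to 0, and the tail of the image is zero padded
    (both are assumptions of this sketch).
    """
    codes = np.array([ENCODING.get(ch, 0) for ch in genotype], dtype=np.float32)
    # Smallest side length whose square holds every position: ceil(sqrt(n)).
    side = math.isqrt(len(codes) - 1) + 1 if len(codes) else 1
    padded = np.zeros(side * side, dtype=np.float32)
    padded[: len(codes)] = codes
    return padded.reshape(side, side)


def image_to_tensor(image: np.ndarray, channels: int) -> np.ndarray:
    """Split the image into `channels` horizontal bands and stack them.

    The result has shape (channels, band_height, width), so positions that
    were far apart in the 2D image sit at the same spatial location in
    different channels.
    """
    height, width = image.shape
    band = -(-height // channels)  # ceiling division
    padded = np.zeros((band * channels, width), dtype=image.dtype)
    padded[:height] = image
    return padded.reshape(channels, band, width)


image = genotype_to_image("ACGTACGTAC")
tensor = image_to_tensor(image, channels=2)
print(image.shape, tensor.shape)
```

A tensor of shape `(channels, height, width)` matches the channel-first input layout that convolutional networks such as ResNet-18 commonly expect; the first convolution would need its input channel count set to `channels`.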

Paper Structure (16 sections, 3 equations, 6 figures, 8 tables)

Figures (6)

  • Figure 1: Pipeline for genotype to tensor image transformation. (a) Original genotype represented as a sequence of characters. (b) Conversion of the genotype sequence into a 2D-image. (c) Transformation of the 2D-image into a tensor representation.
  • Figure 2: Biological interaction among different SNPs.
  • Figure 3: ResNet-18 architecture.
  • Figure 4: Model Performance on Soybean Dataset.
  • Figure 5: Model Performance on Rice Dataset.
  • ...and 1 more figure