TRANSCRIPTOME ANALYSIS OF 13 ELITE DURUM WHEAT VARIETIES SPANNING THE COMPLETE BREEDING ERA DECADES

ORMANBEKOVA D.;MACCAFERRI M.;TUBEROSA R.
2017

Abstract

Variation in gene content and in gene expression, in terms of relative quantitative expression and tissue/organ specificity, is a substantial factor affecting phenotypic diversity. In crops, particularly in cereals, the pan-transcriptome and pan-genome concepts arose soon after the reference genomes were made available. Characterizing the gene expression presence-absence variation (ePAV) of tetraploid durum wheat (Triticum turgidum ssp. durum) makes it possible to investigate the association between genotypic and phenotypic variation at an unprecedented level of precision. The current study presents the transcriptome analysis of 13 elite varieties from worldwide germplasm spanning 1969 to 2005. We aim to describe the gene expression variation in relation to a high-quality reference genome sequence assembly of durum wheat cv. Svevo (courtesy of the International Durum Wheat Genome Sequencing Consortium). cDNA libraries for the 13 varieties were produced from roots and leaves at the seedling stage and from developing grains. To study the gene expression pattern, the RNA-seq libraries of the 13 varieties were aligned to the durum wheat genome using HISAT2. Transcript abundance was calculated using StringTie and Ballgown. The expression matrix was then normalized using the R package DESeq2 for further clustering and variance analysis. The annotation of the assembled Svevo genome resulted in 66,559 high-confidence gene models. Overall, 75.0% (48,007 genes), 70.5% (45,142) and 74.5% (47,702) of genes were expressed in grain, leaf and roots, respectively. Considering the cultivars and the tissue/organ libraries overall, the percentage of genes mapped to the Svevo genome reference varied from 48.0% (Altar84, Capeiti 8, Claudio, Saragolla leaves) to a maximum of 61.0% (Meridiano, Strongfield roots and grains). Principal component analysis (PCA) showed a clear clustering of gene expression driven by organ (leaves, grains and roots, accounting for 33.0% of the variance).
Hierarchical clustering based on the strongest PC1-PC2 scores clearly differentiated up- and down-regulated gene clusters by tissue and variety. Expression variance analysis projected onto the Svevo assembly allowed us to identify the chromosome regions that drove the major expression variation patterns. Presence/absence expression polymorphism could also be observed for several of the genes sorted by PCA. Interestingly, clustering the gene expression profiles together with the cultivars' expression profiles revealed several gene expression patterns related to the ancestry relationships among cultivars, particularly for the grains. The functional annotation of these gene clusters is ongoing. Towards the assembly of a pan-transcriptome in durum, the cultivar-specific reads that could not be mapped to the Svevo genome (4-30% relative to the Svevo Illumina sequencing data) are being de novo assembled. This expression pattern database could also be useful to identify genes regulated by eQTLs and to elucidate the function of candidate genes.
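
As a rough illustration of the downstream steps described above (calling expression presence/absence and running a PCA on the expression matrix), the following minimal Python sketch may help. It is not the authors' pipeline, which used HISAT2, StringTie, Ballgown and DESeq2 in R; the function names, the `min_expr` threshold, the log2(x + 1) transform and the toy matrix are all hypothetical assumptions made only for illustration.

```python
import numpy as np


def call_epav(expr, min_expr=1.0):
    """Boolean genes x samples matrix: True where a gene counts as expressed.

    `min_expr` is a hypothetical cutoff, not a value taken from the study.
    """
    return np.asarray(expr, dtype=float) >= min_expr


def epav_genes(present):
    """Row indices of genes expressed in at least one sample but not in all."""
    n_present = present.sum(axis=1)
    return np.flatnonzero((n_present > 0) & (n_present < present.shape[1]))


def pca_samples(expr, n_components=2):
    """PCA of samples via SVD on log2(x + 1) values centered per gene."""
    logged = np.log2(np.asarray(expr, dtype=float) + 1.0)
    centered = logged - logged.mean(axis=1, keepdims=True)
    # Transpose so that samples are rows and genes are features.
    u, s, _ = np.linalg.svd(centered.T, full_matrices=False)
    scores = u[:, :n_components] * s[:n_components]
    explained = (s ** 2) / np.sum(s ** 2)
    return scores, explained[:n_components]


# Toy normalized expression matrix (assumed data): 4 genes x 4 libraries,
# with columns standing for leaf_A, leaf_B, grain_A, grain_B.
expr = np.array([
    [50.0, 55.0, 0.0, 0.0],    # expressed in leaves only
    [0.0, 0.0, 80.0, 75.0],    # expressed in grains only
    [20.0, 22.0, 21.0, 19.0],  # expressed everywhere
    [0.0, 0.0, 0.0, 0.0],      # never expressed
])

present = call_epav(expr)
variable = epav_genes(present)              # genes showing presence/absence
expressed_fraction = present.mean(axis=0)   # share of genes expressed per library
scores, explained = pca_samples(expr)       # sample coordinates on PC1 and PC2
```

In this toy case the first principal component separates the leaf libraries from the grain libraries, which mirrors, in miniature, the organ-driven clustering reported in the abstract. With real data the threshold and normalization would need to be chosen to match the actual quantification method.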
Proceedings of the Joint Congress SIBV - SIGA
Ormanbekova, D.; Maccaferri, M.; Twardziok, S. O.; Gundlach, H.; Vendramin, V.; Scalabrin, S.; Scaglione, D.; Mantovani, P.; Massi, A.; Mayer, K. F. X.; Morgante, M.; Tuberosa, R.

Use this identifier to cite or link to this document: https://hdl.handle.net/11585/618413