An integrated bioinformatic platform for SNPs detection: the test case of the pig genome

FRONZA, RAFFAELE;FONTANESI, LUCA;RUSSO, VINCENZO;CASADIO, RITA
2009

Abstract

Motivation: The sequencing of the pig genome, which has progressed rapidly in recent years, is expected to impact both human medicine and pork production. The pig is an important biomedical model and, because of its similarity to humans in several respects, is more relevant than the mouse to human health research priorities such as obesity and diabetes. The pig is also one of the most important sources of meat protein for human consumption worldwide. Identifying polymorphisms at the whole-genome level provides the tools to connect genetic variation to complex phenotypic traits and to understand the biology underlying pig production and the use of this species as a human model. Single nucleotide polymorphisms (SNPs) are now the markers of choice for high-throughput genotyping because SNP assays can be designed and automated, and because these markers are stable and densely distributed in the genome. In silico SNP discovery in expressed genes relies on sequence assembly and alignment, starting from large sets of expressed sequence tags (ESTs). Sequence databases contain more than three million porcine ESTs that can be mined and selected for this purpose.

Methods: We developed an integrated platform for detecting pig SNPs in a user-specified fashion, using specifically designed query tools that can select genes involved in, or predicted to have a role in, different processes. The pipeline runs in batch mode and includes the BLAST package to cluster ESTs and to align identified SNPs with partial porcine genomic sequences, PHRAP to build contigs, and CROSS_MATCH and RepeatMasker to remove vectors and repeats. All components of the pipeline are connected by shell and Perl scripts.

Results: The pipeline was first tested on a set of 50 human obesity-related genes, yielding 177 porcine contigs. Our results highlight 226 biallelic SNPs. The EST contigs were mapped onto porcine genomic sequences to obtain sequence information for the design of specific genotyping assays. Genotyping of a subset of the putative (computed) SNPs is under way. We therefore propose our platform as a suitable tool for selecting SNPs within a large number of porcine candidate genes chosen according to the specific requirements of the end user. Furthermore, by also integrating other systems biology tools, the platform will offer valuable help for large-scale screening of any complete genome.
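The abstract does not describe how candidate SNPs are called once ESTs have been assembled into contigs. As an illustration only, the following minimal Python sketch shows one common way to flag biallelic positions in a gapped multiple alignment of EST reads. The function name `call_biallelic_snps`, the input format (equal-length aligned strings with `-` for gaps) and the `min_allele_reads` threshold are assumptions made for this example and are not taken from the authors' pipeline.

```python
from collections import Counter


def call_biallelic_snps(aligned_reads, min_allele_reads=2):
    """Return (column, allele_counts) for every candidate biallelic column.

    aligned_reads: equal-length strings, one per EST read in a contig,
    with '-' for gaps (hypothetical input format for this sketch).
    A column is reported when exactly two bases are each supported by
    at least min_allele_reads reads; rarer bases are treated as likely
    sequencing errors and ignored.
    """
    if not aligned_reads:
        return []
    length = len(aligned_reads[0])
    if any(len(read) != length for read in aligned_reads):
        raise ValueError("all aligned reads must have the same length")

    snps = []
    for col in range(length):
        # Count only unambiguous bases; gaps and N are skipped.
        counts = Counter(
            read[col].upper()
            for read in aligned_reads
            if read[col].upper() in "ACGT"
        )
        supported = {base: n for base, n in counts.items() if n >= min_allele_reads}
        if len(supported) == 2:
            snps.append((col, supported))
    return snps


# Toy contig: column 2 carries G in three reads and C in two reads.
reads = [
    "ACGTACGT",
    "ACGTACGT",
    "ACCTACGT",
    "ACCTAC-T",
    "ACGTANGT",
]
snps = call_biallelic_snps(reads)
print(snps)  # [(2, {'G': 3, 'C': 2})]
```

Requiring each allele to appear in at least two reads is a simple guard against single-read sequencing errors. A real pipeline would typically also weigh base quality and alignment context.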
Proceedings of BITS 09
Use this identifier to cite or link to this document: https://hdl.handle.net/11585/85661