Improving mRNA 5′ coding sequence determination in the mouse genome / Allison Piovesan; Maria Caracausi; Maria Chiara Pelleri; Lorenza Vitale; Silvia Martini; Chiara Bassani; Annalisa Gurioli; Raffaella Casadei; Giulia Soldà; Pierluigi Strippoli. - In: MAMMALIAN GENOME. - ISSN 0938-8990. - PRINT. - 25:(2014), pp. 149-159. [10.1007/s00335-013-9498-3]

Improving mRNA 5′ coding sequence determination in the mouse genome

PIOVESAN, ALLISON;CARACAUSI, MARIA;PELLERI, MARIA CHIARA;VITALE, LORENZA;MARTINI, SILVIA;CASADEI, RAFFAELLA;STRIPPOLI, PIERLUIGI
2014

Abstract

Incomplete determination of the mRNA 5' end sequence may lead to incorrect assignment of the first AUG codon and to errors in the prediction of the encoded protein product. Given the importance of the mouse as a model organism in biomedical research, we performed a systematic identification of coding regions at the 5' end of all known mouse mRNAs, using an automated expressed sequence tag (EST)-based approach that we described previously. By parsing almost 4 million BLAT alignments, we identified an extension of the mRNA 5' coding region in 351 of the 20,221 mouse loci analyzed. Proof-of-concept confirmation was obtained by in vitro cloning and sequencing of Apc2 and Mknk2 cDNAs. We also generated a list of 16,330 mouse mRNAs in which an in-frame stop codon upstream of the known start codon indicates that the coding sequence, in its current form, is complete at the 5' end. Systematic searches of the main mouse genome databases and genome browsers showed that 82% of our results are original and have not been identified by the annotation pipelines of those resources. Moreover, the same information is not easily derived from RNA-Seq data, because of the short sequence length and the labor involved in building full-length transcript structures. In conclusion, our results improve the determination of full-length 5' coding sequences and may help reduce errors when studying mouse gene structure and function in biomedical research.
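The completeness criterion mentioned in the abstract (an in-frame stop codon upstream of the known start codon) can be illustrated with a minimal sketch. This is not the authors' EST-based pipeline: the function name `classify_five_prime`, the three labels it returns, and the toy sequences are assumptions made for illustration only, and the sketch works on a single sequence rather than on BLAT alignments.

```python
# Illustrative sketch only; names and labels are hypothetical, not from the paper.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def classify_five_prime(mrna: str, cds_start: int) -> str:
    """Classify the region upstream of an annotated start codon.

    mrna      -- transcript sequence, 5' to 3' (U is accepted and read as T)
    cds_start -- 0-based index of the A in the annotated ATG
    """
    seq = mrna.upper().replace("U", "T")
    if seq[cds_start:cds_start + 3] != "ATG":
        raise ValueError("cds_start does not point at an ATG codon")

    # Walk upstream one codon at a time, staying in the annotated reading frame.
    for pos in range(cds_start - 3, -1, -3):
        codon = seq[pos:pos + 3]
        if codon in STOP_CODONS:
            # An in-frame stop closes the frame: the 5' coding end is complete.
            return "complete"
        if codon == "ATG":
            # An in-frame upstream ATG with no stop in between:
            # the coding region might extend further 5'.
            return "candidate_extension"
    # Neither a stop nor an ATG was found in frame: the available
    # upstream sequence is too short to decide.
    return "undetermined"


if __name__ == "__main__":
    print(classify_five_prime("TAGGCCATGAAA", 6))
    print(classify_five_prime("ATGGCCATGAAA", 6))
    print(classify_five_prime("GCCGCCATGAAA", 6))
```

In this sketch only codons in the same frame as the annotated ATG are examined, so a stop codon that appears out of frame upstream does not count as evidence of completeness.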

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11585/373609
