Objective: A well-known limitation of genome browsers is that the large amount of genome and gene data they hold is not organized as a searchable database, which hampers full management of numerical data and free calculations. Because the data deposited in genomic repositories keep growing, their content should be periodically revised and analyzed. Using GeneBase, a software tool with a graphical interface that imports and processes National Center for Biotechnology Information (NCBI) Gene database entries, we provide tabulated spreadsheets, updated to 2019, of the human nuclear protein-coding gene data set, ready to be used for any type of analysis of genes, transcripts and gene organization.

Results: Comparison with previous reports reveals substantial changes in the number of known nuclear protein-coding genes (now 19,116), in the protein-coding non-redundant transcriptome space [now 59,281,518 base pairs (bp), a 10.1% increase] and in the number of exons (now 562,164, a 36.2% increase), owing to a marked increase in the number of recorded RNA isoforms. Other parameters, such as the mean and extreme lengths of genes, exons and introns, appear to have stabilized and are unlikely to be substantially modified by future human genome data updates, at least for protein-coding genes. Finally, we confirm that there are no human introns shorter than 30 bp.

Piovesan A., Antonaros F., Vitale L., Strippoli P., Pelleri M.C., Caracausi M. (2019). Human protein-coding genes and gene feature statistics in 2019. BMC Research Notes, 12(1), 1-5 [doi: 10.1186/s13104-019-4343-8].

Human protein-coding genes and gene feature statistics in 2019.

Piovesan A.; Antonaros F.; Vitale L.; Strippoli P.; Pelleri M.C.; Caracausi M.
2019

Files in this item:
Piovesan 2019 Human Genes 2019.pdf (open access)
Type: Publisher's version (PDF)
License: Open Access license. Creative Commons Attribution (CC BY)
Size: 580.97 kB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11585/697666
Citations
  • PubMed Central (PMC) 61
  • Scopus 103
  • Web of Science (ISI) 86