Tramontano A., Scarlato V., Barni N., Cipollaro M., Franze A., Macchiato M.F., et al. (1984). Statistical evaluation of the coding capacity of complementary DNA strands. Nucleic Acids Research, 12(12), 5049-5059. doi:10.1093/nar/12.12.5049
Statistical evaluation of the coding capacity of complementary DNA strands
Scarlato V.
1984
Abstract
Two independent methods are used to evaluate the protein-coding information content in different classes of DNA sequences. The first method evaluates the statistical significance of finding unidentified reading frames longer than 100 codons on both DNA strands of: (a) 117 DNA sequences that code for 142 nuclear proteins; (b) 39 stable RNA coding sequences; and (c) 36 other DNA sequences, which include regulatory sequences and sequences of as yet unknown function. The finding of 50 reading frames longer than 100 codons (complementary inverted proteins, or c.i.p. genes) located on the DNA strand complementary to the protein-coding one is far in excess of the number predicted by chance alone. An independent method (testcode), which assigns a probability of coding to a given sequence, was applied to the c.i.p. gene sequences and predicts that more than 50% of these genes are translated into a functional product. These analyses indicate the existence of a new class of protein-coding genes, located on the DNA strand complementary to the protein-coding strand. © 1984 IRL Press Limited.
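The first kind of analysis described in the abstract, scanning both strands for stop-free reading frames longer than 100 codons, can be illustrated with a minimal Python sketch. This is not the authors' procedure: the function names (`open_frames`, `scan_both_strands`, `chance_of_open_run`) are hypothetical, the standard stop codons are assumed, and the chance estimate assumes independent, equally frequent bases, which the paper's own statistical model may not.

```python
# Illustrative sketch only; names and the uniform-base assumption are not from the paper.
STOPS = {"TAA", "TAG", "TGA"}           # standard genetic code stop codons (assumed)
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the complementary strand read 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def open_frames(seq, min_codons=100):
    """List (frame, start, n_codons) for stop-free runs longer than min_codons."""
    seq = seq.upper()
    hits = []
    for frame in range(3):
        run_start = frame
        pos = frame
        while pos + 3 <= len(seq):
            if seq[pos:pos + 3] in STOPS:
                n_codons = (pos - run_start) // 3
                if n_codons > min_codons:
                    hits.append((frame, run_start, n_codons))
                run_start = pos + 3      # next run begins after the stop codon
            pos += 3
        # Close the run that reaches the end of the sequence.
        n_codons = (pos - run_start) // 3
        if n_codons > min_codons:
            hits.append((frame, run_start, n_codons))
    return hits


def scan_both_strands(seq, min_codons=100):
    """Scan the given strand and its reverse complement in all three frames."""
    return {
        "given": open_frames(seq, min_codons),
        "complementary": open_frames(reverse_complement(seq), min_codons),
    }


def chance_of_open_run(n_codons=100, p_stop=3 / 64):
    """Probability that n_codons consecutive random codons contain no stop,
    assuming independent, equally frequent bases (3 of 64 codons are stops)."""
    return (1 - p_stop) ** n_codons


if __name__ == "__main__":
    toy = "GCT" * 120 + "TAA" + "GCT" * 10   # synthetic sequence, not real data
    print(scan_both_strands(toy))
    print(chance_of_open_run())
```

Comparing the number of long open frames actually observed on the complementary strand with the number expected under a null model of this kind is the sort of test the abstract refers to. A realistic null model would need to account for the base composition of each sequence.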