BACKGROUND: Present knowledge of protein structures at the atomic level derives from some 60,000 molecules. Yet the exponentially growing set of hypothetical protein sequences comprises some 10 million chains, which makes protein structure prediction one of the most challenging goals of bioinformatics. In this context, the representation of proteins as contact maps is an intermediate step of fold recognition and constitutes the input of contact map predictors. However, contact map representations require fast and reliable methods to reconstruct the specific folding of the protein backbone. METHODS: In this paper, using GRID technology, we benchmark our 3D reconstruction algorithm, FT-COMAR, on a large set of non-redundant proteins (1716), taking random noise into consideration; this makes our computation the largest ever performed for the task at hand. RESULTS: We observe the effects of introducing random noise on 3D reconstruction and derive considerations useful for future implementations. The size of the protein set also allows statistical considerations after grouping by SCOP structural class. CONCLUSIONS: Altogether, our data indicate that the quality of 3D reconstruction is unaffected by deleting up to an average of 75% of the real contacts, while only a few percent of randomly generated contacts in place of non-contacts are sufficient to hamper 3D reconstruction.
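The two kinds of noise named in the abstract (deleting real contacts and adding random contacts in place of non-contacts) can be illustrated with a minimal Python sketch. It is not FT-COMAR and does not reproduce the authors' benchmark. The function names `contact_map` and `blur`, the 8 Å distance threshold, the one-point-per-residue representation, and the toy coordinates are all assumptions made for illustration; the abstract specifies none of them.

```python
import math
import random


def contact_map(coords, threshold=8.0):
    """Return the set of residue pairs (i, j), i < j, closer than `threshold`.

    `coords` holds one 3D point per residue. The 8.0 default is an
    assumed cutoff, not a value taken from the abstract.
    """
    contacts = set()
    for i in range(len(coords)):
        for j in range(i + 1, len(coords)):
            if math.dist(coords[i], coords[j]) < threshold:
                contacts.add((i, j))
    return contacts


def blur(contacts, n_residues, delete_frac, add_frac, seed=0):
    """Return a noisy copy of a contact map (hypothetical noise model).

    delete_frac: fraction of real contacts to remove.
    add_frac:    fraction of non-contacts to turn into false contacts.
    """
    rng = random.Random(seed)
    real = sorted(contacts)
    non_contacts = [
        (i, j)
        for i in range(n_residues)
        for j in range(i + 1, n_residues)
        if (i, j) not in contacts
    ]
    n_delete = round(delete_frac * len(real))
    n_add = round(add_frac * len(non_contacts))
    kept = set(rng.sample(real, len(real) - n_delete))
    false_contacts = set(rng.sample(non_contacts, n_add))
    return kept | false_contacts


# Toy chain: 10 residues on a straight line, 3.8 units apart (assumed spacing).
coords = [(3.8 * i, 0.0, 0.0) for i in range(10)]
native = contact_map(coords)

# Delete 75% of the real contacts and flip 5% of the non-contacts.
noisy = blur(native, n_residues=len(coords), delete_frac=0.75, add_frac=0.05)

print("real contacts:", len(native))
print("real contacts kept:", len(noisy & native))
print("false contacts added:", len(noisy - native))
```

A blurred map like `noisy` would then be passed to a reconstruction method, and the recovered backbone compared with the native one. That step is outside the scope of this sketch.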
Vassura M., Di Lena P., Margara L., Mirto M., Aloisio G., Fariselli P., et al. (2011). Blurring contact maps of thousands of proteins: what we can learn by reconstructing 3D structure. BIODATA MINING, 4 [10.1186/1756-0381-4-1].
Blurring contact maps of thousands of proteins: what we can learn by reconstructing 3D structure.
VASSURA, MARCO;DI LENA, PIETRO;MARGARA, LUCIANO;FARISELLI, PIERO;CASADIO, RITA
2011