This work focuses on ensembles of classifiers for predicting HIV protease cleavage sites in proteins. Because of the complex relationships in biological data, several recent works show that ensembles of learning algorithms often outperform stand-alone methods. We show that fusing approaches based on different encoding models can improve performance on this classification problem. In particular, four different feature encodings for peptides are described and tested. An extensive evaluation on a large dataset, carried out under a blind testing protocol, demonstrates how different feature extraction methods and classifiers can be combined to obtain a robust and reliable system. A comparison with other stand-alone approaches quantifies the performance improvement obtained by the proposed ensembles.
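The idea of fusing classifiers built on different peptide encodings can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration, not the authors' method: the abstract does not name the four encodings or the fusion rule, so the sketch uses two common choices (an orthogonal, one-hot encoding and a Kyte-Doolittle hydropathy encoding) and a plain average of scores (the sum rule). The function names and the constant-valued toy scorers are hypothetical stand-ins for trained classifiers.

```python
from typing import Callable, List, Sequence, Tuple

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Kyte-Doolittle hydropathy index, used here only as an example property scale.
HYDROPATHY = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

Encoder = Callable[[str], List[float]]
Scorer = Callable[[List[float]], float]


def _check(peptide: str) -> None:
    """Reject residues outside the 20 standard amino acids."""
    for residue in peptide:
        if residue not in AMINO_ACIDS:
            raise ValueError(f"unknown residue: {residue!r}")


def orthogonal_encoding(peptide: str) -> List[float]:
    """One-hot encoding: 20 binary values per residue position."""
    _check(peptide)
    return [1.0 if residue == aa else 0.0
            for residue in peptide for aa in AMINO_ACIDS]


def hydropathy_encoding(peptide: str) -> List[float]:
    """One hydropathy value per residue position."""
    _check(peptide)
    return [HYDROPATHY[residue] for residue in peptide]


def sum_rule(scores: Sequence[float]) -> float:
    """Fuse classifier scores by averaging them."""
    if not scores:
        raise ValueError("at least one score is required")
    return sum(scores) / len(scores)


def ensemble_score(peptide: str,
                   members: Sequence[Tuple[Encoder, Scorer]]) -> float:
    """Score a peptide with each (encoder, scorer) pair, then fuse."""
    return sum_rule([scorer(encode(peptide)) for encode, scorer in members])


if __name__ == "__main__":
    # Hypothetical scorers; real ones would be classifiers trained per encoding.
    members = [
        (orthogonal_encoding, lambda features: 0.8),
        (hydropathy_encoding, lambda features: 0.4),
    ]
    print(ensemble_score("SQNYPIVQ", members))
```

In a real system each scorer would be a classifier trained on the output of its own encoder, and the fused score would be thresholded to decide whether the peptide is cleaved. Averaging is only one possible fusion rule; weighted sums or voting would slot into `sum_rule` in the same way.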
Nanni, L., Lumini, A. (2009). Using ensemble of classifiers for predicting HIV protease cleavage sites in proteins. Amino Acids, 36, 409-416 [10.1007/s00726-008-0076-z].
Using ensemble of classifiers for predicting HIV protease cleavage sites in proteins
Nanni, Loris; Lumini, Alessandra
2009