Using ensemble of classifiers in Bioinformatics

Nanni, Loris; Lumini, Alessandra

This chapter focuses on the use of ensembles of classifiers in bioinformatics. Because of the complex relationships in biological data, several recent works show that ensembles of learning algorithms often outperform stand-alone methods. The main idea is that, by averaging the different hypotheses of the classifiers, the combined system may produce a good approximation of the true hypothesis. After a short introduction to the basic concepts of classifier combination, a detailed review of the existing literature is provided, discussing the most relevant ensemble approaches applied to protein, peptide and microarray classification. Various critical issues related to bioinformatics datasets are discussed, and some suggestions on the design and testing of ensembles of classifiers for bioinformatics problems are given. Moreover, some methods for evaluating the complementarity of feature extraction techniques are discussed. Finally, several experimental results are presented to show how different feature extraction methods and classifiers based on different methodologies can be combined to obtain a robust and reliable system. The comparison with other state-of-the-art approaches makes it possible to quantify the performance improvement obtained by the ensembles proposed in this work. In conclusion, the aim of the present work is to point out some of the advantages and potential of using a multi-classifier system instead of a stand-alone method in several bioinformatics problems, and to outline some promising research directions for the future.

L. Nanni, A. Lumini (2010). Using ensemble of classifiers in Bioinformatics. Hauppauge, NY: Nova Publishers.