The monitoring of the expression profiles of thousands of genes have proved to be particularly promising for biological classification. DNA microarray data have been recently used for the development of classification rules, particularly for cancer diagnosis. However, microarray data present major challenges due to the complex, multiclass nature and the overwhelming number of variables characterizing gene expression profiles. A regularized form of sliced inverse regression (REGSIR) approach is proposed. It allows the simultaneous development of classification rules and the selection of those genes that are most important in terms of classification accuracy. The method is illustrated on some publicly available microarray data sets. Furthermore, an extensive comparison with other classification methods is reported. The REGSIR performance is comparable with the best classification methods available, and when appropriate feature selection is made the performance can be considerably improved. © 2007 Elsevier B.V. All rights reserved.
Scrucca, L. (2007). Class prediction and gene selection for DNA microarrays using regularized sliced inverse regression. COMPUTATIONAL STATISTICS & DATA ANALYSIS, 52(1), 438-451 [10.1016/j.csda.2007.02.005].
Class prediction and gene selection for DNA microarrays using regularized sliced inverse regression
Scrucca L.
2007
Abstract
The monitoring of the expression profiles of thousands of genes have proved to be particularly promising for biological classification. DNA microarray data have been recently used for the development of classification rules, particularly for cancer diagnosis. However, microarray data present major challenges due to the complex, multiclass nature and the overwhelming number of variables characterizing gene expression profiles. A regularized form of sliced inverse regression (REGSIR) approach is proposed. It allows the simultaneous development of classification rules and the selection of those genes that are most important in terms of classification accuracy. The method is illustrated on some publicly available microarray data sets. Furthermore, an extensive comparison with other classification methods is reported. The REGSIR performance is comparable with the best classification methods available, and when appropriate feature selection is made the performance can be considerably improved. © 2007 Elsevier B.V. All rights reserved.I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.