In this paper we present an extensive study of artificial intelligence (AI) techniques, namely ensembles of classifiers and feature selection, for the identification of students with learning disabilities (LD). The experimental results show that our best method, which combines an ensemble of classifiers with feature selection, can correctly identify up to 50% of the LD students with 100% confidence. The same holds when samples in the “junior high school” dataset are predicted using a model built on the “elementary school” students, and when the “junior high school” samples are used to build the model and the samples in the “elementary school” dataset are predicted. In particular, we propose variants of two recent Feature Transform-based ensemble methods (Rotation Forest and Input Decimated Ensemble). In the Rotation Forest, the feature set is randomly split into subsets and Principal Component Analysis (PCA) is used to transform the features that belong to each subset. The Input Decimated Ensemble first singles out a given class i and runs PCA on the data of that class only. The resulting transformation is applied to the whole dataset and a classifier Di is trained on the transformed patterns. This choice limits the size of the ensemble to the number of classes. In this paper, we perform an empirical comparison that varies the Feature Transform method used in the Rotation Forest technique, and we propose a clustering method to overcome this drawback of the Input Decimated Ensemble.
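The two feature transforms described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names (`pca_components`, `rotation_matrix`, `input_decimated_transforms`), the synthetic data, and the parameter values are assumptions made for illustration. It only builds the transforms; training the base classifiers, the bootstrap and class sampling used in the full Rotation Forest, and the clustering variant proposed in the paper are left out.

```python
import numpy as np


def pca_components(X):
    """Return the principal axes of X as columns (PCA via SVD).

    Assumes X has at least as many rows as columns.
    """
    Xc = X - X.mean(axis=0)  # centre each feature
    # Rows of Vt are the principal axes, ordered by decreasing variance.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Vt.T


def rotation_matrix(X, n_subsets, rng):
    """Rotation Forest-style transform (simplified).

    The features are randomly split into disjoint subsets and PCA is run
    on each subset; the per-subset axes are placed in one square matrix.
    """
    n_features = X.shape[1]
    perm = rng.permutation(n_features)
    R = np.zeros((n_features, n_features))
    for subset in np.array_split(perm, n_subsets):
        R[np.ix_(subset, subset)] = pca_components(X[:, subset])
    return R


def input_decimated_transforms(X, y, n_components):
    """Input Decimated-style transforms (simplified).

    For each class, PCA is run on the samples of that class only. The
    resulting projection is meant to be applied to the whole dataset,
    so there is one transform (and one classifier) per class.
    """
    return {
        c: pca_components(X[y == c])[:, :n_components]
        for c in np.unique(y)
    }


# Synthetic two-class data, used only to exercise the functions.
rng = np.random.default_rng(0)
X = rng.normal(size=(60, 6))
y = np.repeat([0, 1], 30)

R = rotation_matrix(X, n_subsets=3, rng=rng)
X_rot = X @ R  # input for one Rotation Forest-style base classifier

transforms = input_decimated_transforms(X, y, n_components=2)
X_for_class0 = X @ transforms[0]  # whole dataset in class 0's PCA basis
```

Because each class yields exactly one transform, `len(transforms)` equals the number of classes, which is the ensemble-size limit mentioned in the abstract. Drawing a new random feature split for `rotation_matrix` gives a different rotation for each ensemble member.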
L. Nanni, A. Lumini (2009). Ensemble generation and feature selection for the identification of students with learning disabilities. EXPERT SYSTEMS WITH APPLICATIONS, 36, 3896-3900 [10.1016/j.eswa.2008.02.065].
Ensemble generation and feature selection for the identification of students with learning disabilities
NANNI, LORIS;LUMINI, ALESSANDRA
2009