Zanellati, A., Zingaro, S.P., Gabbrielli, M. (2024). Balancing Performance and Explainability in Academic Dropout Prediction. IEEE TRANSACTIONS ON LEARNING TECHNOLOGIES, 17, 2086-2099 [10.1109/TLT.2024.3425959].
Balancing Performance and Explainability in Academic Dropout Prediction
Zanellati A. (Investigation); Zingaro S. P. (Methodology); Gabbrielli M. (Supervision)
2024
Abstract
Academic dropout remains a significant challenge for education systems, necessitating rigorous analysis and targeted interventions. This study employs machine learning techniques, specifically random forest (RF) and feature tokenizer transformer (FTT), to predict academic attrition. Utilizing a comprehensive dataset of over 40,000 students from an Italian university, the research incorporates a range of variables, including demographic information, prior educational metrics, and real-time academic performance indicators. We present a nuanced comparative evaluation of the RF and FTT models, highlighting their predictive accuracy and interpretative capabilities. Our empirical results demonstrate the effectiveness of machine learning in managing student attrition, with FTT models outperforming RF models in terms of predictive accuracy and achieving a sensitivity of 81%. Significantly, the inclusion of historical academic data enhances the models' ability to identify students at increased risk of dropping out. Furthermore, we apply advanced explanatory techniques, such as Shapley additive explanations (SHAP), to investigate the discriminative power of these models across different student profiles. This provides valuable insights into the key variables influencing dropout risk, contributing to a more holistic understanding of the issue. In addition, we conduct a fairness analysis to ensure the ethical robustness of our predictive models, making them not only effective but also equitable tools.

File | Size | Format
---|---|---
Balancing_Performance_and_Explainability_in_Academic_Dropout_Prediction.pdf (open access; type: publisher's version (PDF); license: Open Access License, Creative Commons Attribution (CC BY)) | 1.4 MB | Adobe PDF
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.