This study proposes a machine learning (ML) approach to classify users’ personalities based on their online comments, using the Myers-Briggs Type Indicator (MBTI) as the reference model. Unlike recent transformer-based models, our method offers a faster and more cost-effective solution that does not require substantial computational power or additional expense. We first trained an Extreme Gradient Boosting (XGBoost) classifier on the well-known MBTI Kaggle dataset and improved its performance with the Synthetic Minority Oversampling Technique (SMOTE) to address class imbalance. We then applied the trained model to real-world data, specifically YouTube comments related to travel and spirituality, to demonstrate its effectiveness in capturing personality traits. The results highlight the potential of our approach for practical applications in personalized content and behavioral analysis. This study underscores the viability of traditional ML techniques in personality classification and paves the way for further research to improve model robustness and scalability.
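The abstract names SMOTE as the remedy for class imbalance but gives no implementation details. As a rough illustration of the idea only, the following pure-Python sketch generates synthetic minority samples by interpolating between a minority point and one of its nearest minority neighbours. The function name `smote`, its parameters, and the toy feature vectors are all hypothetical; the study itself most likely relied on a library implementation and on text-derived features that are not described here.

```python
import random


def smote(minority, n_new, k=3, seed=0):
    """Return n_new synthetic points interpolated between minority samples.

    Hypothetical helper for illustration; not the authors' implementation.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority points by squared Euclidean distance to base.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        # Pick one of the k nearest neighbours at random.
        neighbour = rng.choice(others[:k])
        # Place the new point at a random position on the segment base -> neighbour.
        gap = rng.random()
        synthetic.append([a + gap * (b - a) for a, b in zip(base, neighbour)])
    return synthetic


# Toy 2-D feature vectors standing in for an under-represented class.
minority = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
new_points = smote(minority, n_new=6)
```

Because every synthetic point lies on a segment between two existing minority samples, the oversampled class stays within the region already covered by real data. In a full pipeline, the augmented training set would then be passed to the classifier.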
Stracqualursi, L., Agati, P. (in press). From Words to Personality: Machine Learning for MBTI Profiling. Cham: Springer Nature Switzerland AG.
From Words to Personality: Machine Learning for MBTI Profiling
Luisa Stracqualursi (first author); Patrizia Agati (second author)
In press