Mitrovic S., Frisone F., Gupta S., Lucifora C., Carapic D., Schillaci C., et al. (2023). Annotating Panic in Social Media using Active Learning, Transformers and Domain Knowledge. Los Alamitos, CA, USA: IEEE Computer Society [10.1109/ICDMW60847.2023.00164].
Annotating Panic in Social Media using Active Learning, Transformers and Domain Knowledge
Lucifora C.
2023
Abstract
Researchers now widely agree on the importance of mental health. However, the literature on tracking mental disorders in textual content from social media platforms is heavily skewed towards specific problems. In particular, panic disorder and panic attacks are understudied, and the relevant resources are missing. In this work we therefore focus on collecting an annotated dataset. To reduce the annotation effort by selectively annotating unlabeled data, we propose an active-learning-based approach with uncertainty sampling, supported by contextualized (Transformer-based) representations, symptomatic and psychometric features, and domain expertise. Our evaluation demonstrates the efficiency of the proposed approach in terms of both classification accuracy and prediction confidence. Our contribution to the research community is an annotated dataset of 13,036 tweets that distinguishes between personal panicking experiences such as panic attacks, other panic-related content, and completely panic-unrelated content. We hope that it will foster research on the topic.
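As a rough illustration of the uncertainty sampling idea mentioned in the abstract, the sketch below ranks unlabeled items by a least-confidence score and picks the most uncertain ones for annotation. It is not the authors' implementation: the function names, the least-confidence criterion, the batch size, and the probability values are all assumptions made for the example, and the abstract does not say which uncertainty measure the paper uses.

```python
from typing import List, Sequence


def least_confidence(probs: Sequence[float]) -> float:
    """Uncertainty score: 1 minus the probability of the most likely class."""
    return 1.0 - max(probs)


def select_for_annotation(pool_probs: List[Sequence[float]], batch_size: int) -> List[int]:
    """Return the indices of the `batch_size` most uncertain pool items."""
    ranked = sorted(
        range(len(pool_probs)),
        key=lambda i: least_confidence(pool_probs[i]),
        reverse=True,  # most uncertain first
    )
    return ranked[:batch_size]


# Hypothetical classifier outputs for four unlabeled tweets, ordered as
# (personal panic experience, other panic-related, panic-unrelated).
pool_probs = [
    [0.90, 0.05, 0.05],
    [0.40, 0.35, 0.25],
    [0.10, 0.10, 0.80],
    [0.50, 0.45, 0.05],
]

# Indices of the tweets a human annotator would label next.
to_annotate = select_for_annotation(pool_probs, batch_size=2)
```

In a full active-learning loop, the selected items would be labeled, added to the training set, and the classifier retrained before the next selection round.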