Phenomena Explanation from Text: Unsupervised Learning of Interpretable and Statistically Significant Knowledge

Giacomo Frisoni; Gianluca Moro
2021

Abstract

Learning knowledge from text is becoming increasingly important as the amount of unstructured content on the Web rapidly grows. Despite recent breakthroughs in natural language understanding, the explanation of phenomena from textual documents is still a difficult and poorly addressed problem. Additionally, current NLP solutions often require labeled data, are domain-dependent, and are based on black-box models. In this paper, we introduce POIROT, a new descriptive text mining methodology for phenomena explanation from document corpora. POIROT is designed to provide accurate and interpretable results in unsupervised settings, quantifying them based on their statistical significance. We evaluated POIROT on a medical case study, with the aim of learning the “voice of patients” from short social posts. Taking Esophageal Achalasia as a reference, we automatically derived scientific correlations with a 79% F1 score and built useful explanations of the patients’ viewpoint on topics such as symptoms, treatments, drugs, and foods. We make the source code and experiment details publicly available (https://github.com/unibodatascience/POIROT).
Communications in Computer and Information Science
Pages 293-318
Giacomo Frisoni, Gianluca Moro (2021). Phenomena Explanation from Text: Unsupervised Learning of Interpretable and Statistically Significant Knowledge. Springer Science and Business Media Deutschland GmbH [10.1007/978-3-030-83014-4_14].

Use this identifier to cite or link to this document: https://hdl.handle.net/11585/917627
Citations
  • Scopus: 21