CRIS Current Research Information System

Giacomo Frisoni, Gianluca Moro (2021). Phenomena Explanation from Text: Unsupervised Learning of Interpretable and Statistically Significant Knowledge. Springer Science and Business Media Deutschland GmbH [10.1007/978-3-030-83014-4_14].

Phenomena Explanation from Text: Unsupervised Learning of Interpretable and Statistically Significant Knowledge

Giacomo Frisoni; Gianluca Moro

2021

Abstract

Learning knowledge from text is becoming increasingly important as the amount of unstructured content on the Web rapidly grows. Despite recent breakthroughs in natural language understanding, explaining phenomena from textual documents remains a difficult and poorly addressed problem. Additionally, current NLP solutions often require labeled data, are domain-dependent, and are based on black-box models. In this paper, we introduce POIROT, a new descriptive text mining methodology for phenomena explanation from document corpora. POIROT is designed to provide accurate and interpretable results in unsupervised settings, quantifying them based on their statistical significance. We evaluated POIROT on a medical case study, with the aim of learning the “voice of patients” from short social posts. Taking Esophageal Achalasia as a reference, we automatically derived scientific correlations with a 79% F1 score and built useful explanations of the patients’ viewpoint on topics such as symptoms, treatments, drugs, and foods. We make the source code and experiment details publicly available (https://github.com/unibodatascience/POIROT).

Year: 2021
Volume title: Communications in Computer and Information Science
First page: 293
Last page: 318
Series: Communications in Computer and Information Science
DOI: https://dx.doi.org/10.1007/978-3-030-83014-4_14
Citation: Giacomo Frisoni, Gianluca Moro (2021). Phenomena Explanation from Text: Unsupervised Learning of Interpretable and Statistically Significant Knowledge. Springer Science and Business Media Deutschland GmbH [10.1007/978-3-030-83014-4_14].
All authors: Giacomo Frisoni; Gianluca Moro

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11585/917627

Warning: the data shown have not been validated by the university.
