The task of assigning classification codes to short medical text is a hard text classification problem, especially when the set of possible codes is as large as the ICD-9-CM set. The problem, which has been only partially tamed for a subset of ICD-9-CM, becomes even harder in real-world applications, where labeled data are scarce and noisy. In this paper we first show the ineffectiveness of current Text Classification algorithms on large datasets; we then present a novel incremental approach to clinical Text Classification, which overcomes the low-accuracy problem through top-K retrieval, exploits Transfer Learning techniques to expand a skewed dataset, and improves overall accuracy over time by learning from user selection.

ICD code retrieval: Novel approach for assisted disease classification / Rizzo, Stefano Giovanni; Montesi, Danilo; Fabbri, Andrea; Marchesini, Giulio. - ELECTRONIC. - 9162:(2015), pp. 147-161. (Paper presented at the 11th International Conference on Data Integration in the Life Sciences, DILS 2015, held in the USA in 2015) [10.1007/978-3-319-21843-4_12].

ICD code retrieval: Novel approach for assisted disease classification

RIZZO, STEFANO GIOVANNI;MONTESI, DANILO;FABBRI, ANDREA;MARCHESINI REGGIANI, GIULIO
2015

Use this identifier to cite or link to this document: https://hdl.handle.net/11585/548445