Tamburini F. (2013). The AnIta-Lemmatiser: a tool for accurate lemmatisation of Italian texts. Berlin Heidelberg: Springer Verlag [10.1007/978-3-642-35828-9].
The AnIta-Lemmatiser: a tool for accurate lemmatisation of Italian texts
TAMBURINI, FABIO
2013
Abstract
This paper presents the AnIta-Lemmatiser, an automatic tool for lemmatising Italian texts. It is based on a powerful morphological analyser enriched with a large lexicon and on heuristic techniques that select the most appropriate lemma among those that can be morphologically associated with an ambiguous wordform. The heuristics are essentially based on the frequency-of-use tags provided by the De Mauro/Paravia electronic dictionary. The AnIta-Lemmatiser ranked second in the Lemmatisation Task of the EVALITA 2011 evaluation campaign. Beyond the official lemmatiser used for EVALITA, some further improvements are presented.
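
The frequency-based selection described in the abstract can be pictured with a minimal sketch. It is not the AnIta-Lemmatiser's implementation: the ranking of usage marks, the toy analyses, the tags assigned to each lemma, the fallback behaviour, and the function name `choose_lemma` are all assumptions made for illustration only.

```python
# Illustrative sketch only. The rank order, the toy analyses and the tags
# attached to each lemma are assumptions, not data from the De Mauro/Paravia
# dictionary or from the AnIta-Lemmatiser.

# Usage marks in the style of the De Mauro dictionary, ordered here from
# most to least frequent use (assumed order).
USAGE_RANK = {"FO": 0, "AU": 1, "AD": 2, "CO": 3, "TS": 4, "LE": 5, "BU": 6, "OB": 7}

# Hypothetical analyser output: wordform -> candidate (lemma, usage mark) pairs.
TOY_ANALYSES = {
    "porto": [("portare", "FO"), ("porto", "AU"), ("porgere", "CO")],
}


def choose_lemma(wordform, analyses=TOY_ANALYSES, rank=USAGE_RANK):
    """Return the candidate lemma whose usage mark has the best (lowest) rank."""
    candidates = analyses.get(wordform)
    if not candidates:
        # Assumed fallback for unknown wordforms: return the wordform unchanged.
        return wordform
    # Unknown marks are ranked after all known ones; ties keep the first candidate.
    best = min(candidates, key=lambda cand: rank.get(cand[1], len(rank)))
    return best[0]


if __name__ == "__main__":
    print(choose_lemma("porto"))
```

With these toy values, the ambiguous wordform `porto` resolves to the candidate carrying the highest-frequency mark. A real system would also have to handle candidates with identical marks, which this sketch resolves simply by list order.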