PAISA' is a Creative Commons licensed, large web corpus of contemporary Italian. We describe the design, harvesting, and processing steps involved in its creation.
Verena Lyding, Egon Stemle, Claudia Borghetti, Marco Brunello, Sara Castagnoli, Felice Dell'Orletta, et al. (2014). The PAISA' Corpus of Italian Web Texts. Stroudsburg, PA: Association for Computational Linguistics.
The PAISA' Corpus of Italian Web Texts
Claudia Borghetti; Sara Castagnoli
2014