Corpora@DipInTra is a sharing hub giving unified access to the textual databases developed at the Corpora, Linguistics, Technology Research centre (CoLiTec corpora). Through this interface, the CoLiTec corpora are made widely available to the academic community, as well as to other interested stakeholders, such as students, language professionals and companies. The corpora include both large web-derived collections and small but highly specialised ones, for example: a) acWaC (academic Web-as-Corpus), a pool of corpus resources for studying institutional-academic language; b) WaCky (Web-As-Corpus Kool Yinitiative), a collection of large corpora built by automatically downloading texts from the English, French, German and Italian web; and c) La Repubblica, a corpus of Italian newspaper texts published between 1985 and 2000. The interface itself is based on the NoSketch Engine platform, a state-of-the-art, open-source tool for corpus management that provides a powerful and user-friendly interface for performing corpus searches, generating word and keyword lists, and retrieving collocations based on several statistical measures.

Bernardini, S., Ferraresi, A., Zanchetta, E., Dalan, E. (2016). Corpora@DipInTra (Interface to the CoLiTec corpora).

Corpora@DipInTra (Interface to the CoLiTec corpora)

Bernardini, Silvia; Ferraresi, Adriano; Zanchetta, Eros; Dalan, Erika
2016

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11585/600415