We aim to automatically induce a PoS tagset for Italian by analysing the distributional behaviour of Italian words. To this end, we propose an algorithm that (a) extracts information from loosely labelled dependency structures that encode only basic and broadly accepted syntactic relations, namely Head/Dependent and the distinction of dependents into Argument vs. Adjunct, and (b) derives a possible set of word classes. The paper reports on some preliminary experiments carried out using the induced tagset in conjunction with state-of-the-art PoS taggers. The method proposed to design a proper tagset exploits little, if any, language-specific knowledge: hence it is in principle applicable to any language.

POS tagset design for Italian / Bernardi R.; Bolognesi A.; Seidenari C.; Tamburini F. - Print. - (2006), pp. 1396-1401. (Paper presented at the 5th International Conference on Language Resources and Evaluation - LREC 2006, held in Genoa, 22-28 May 2006).

POS tagset design for Italian

TAMBURINI, FABIO
2006

Proceedings of 5th International Conference on Language Resources and Evaluation - LREC 2006, pp. 1396-1401
Bernardi R.; Bolognesi A.; Seidenari C.; Tamburini F.


Use this identifier to cite or link to this document: https://hdl.handle.net/11585/29799