This paper presents ongoing research on the annotation of large corpora with morphological information. It aims to provide a general schema for inserting rich morphological information, so as to enable complex corpus queries on word-internal structure. Annotating real corpus data presents challenges that can hardly be managed with a traditional linear analysis of word structure, but that can be handled efficiently and correctly with different, more complex structures. For this reason, we propose Derivation Graphs as a new tool for representing the structure of complex words, and we discuss the theoretical consequences of this choice for the representation of affixes, a crucial issue for all morphological models.
Grandi N., Montermini F., Tamburini F. (2011). ANNOTATING LARGE CORPORA FOR STUDYING ITALIAN DERIVATIONAL MORPHOLOGY. LINGUE E LINGUAGGIO, X (2011)(2), 227-244 [10.1418/35841].
ANNOTATING LARGE CORPORA FOR STUDYING ITALIAN DERIVATIONAL MORPHOLOGY
GRANDI, NICOLA;TAMBURINI, FABIO
2011