In this paper, we discuss the application of a sense induction procedure to data from CORIS, a well-balanced reference corpus of Italian. The method considered discriminates between the different senses of a word by analysing the relationships between its collocates and suggesting collocate clusters, each of which corresponds to one sense of the word. The collocate clusters are represented as 3D graphs in a semantic space. We show that for some examples the method can satisfactorily induce the senses of the chosen node; however, we also show that for some controversial instances human interpretation of the results is needed. We thus conclude that, although powerful, automated systems still require human knowledge for both the analysis and the interpretation of language phenomena, and that an integration of the two methodologies is desirable.
Rossini Favretti R., Tamburini F., Zaninello A. (2011). Exploiting corpus evidence for automatic sense induction. València: Editorial Universitat Politècnica de València.
Exploiting corpus evidence for automatic sense induction
Rossini Favretti, Rema; Tamburini, Fabio
2011