D'Errico, M., Grandi, N., Paternesi Melloni, S., Tamburini, F. (2016). INDUZIONE DI CATEGORIE GRAMMATICALI E LESSICALI. Roma : Il Calamo.

INDUZIONE DI CATEGORIE GRAMMATICALI E LESSICALI

D'Errico, M.; Grandi, Nicola; Paternesi Melloni, S.; Tamburini, Fabio

2016

Abstract

This paper aims to give an ‘a-theoretical’ definition of the main parts of speech by extracting the set of categories from the actual distribution of the data, that is, from the contexts in which words occur. The definitions obtained in this way depend solely on contextual information and on the analysis of distributional similarities among words, and are not conditioned by any theoretical framework. The research hypothesis is that two words which are formally and semantically similar and which share the same syntactic behavior will occur in similar contexts. Consequently, if we classify words according to their contexts of occurrence, we should expect formally and semantically similar words to fall into the same class. If we investigate a large, representative corpus of a language, we should therefore be able to extract all the parts of speech automatically by surveying the contexts of occurrence. In this article we test this approach on Italian, basing our analysis on CORIS, a representative corpus of written Italian.
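The distributional hypothesis stated in the abstract can be illustrated with a minimal sketch. This is not the procedure used in the chapter: the four-sentence toy corpus, the one-word context window, and the names `context_vectors` and `cosine` are all assumptions made here for illustration only. Each word is described by counts of its immediate left and right neighbours, and words are then compared by cosine similarity.

```python
from collections import Counter, defaultdict
from math import sqrt


def context_vectors(sentences):
    """Map each word to a Counter of its (side, neighbour) contexts."""
    vectors = defaultdict(Counter)
    for sentence in sentences:
        # Pad with boundary markers so every word has two neighbours.
        tokens = ["<s>"] + sentence.lower().split() + ["</s>"]
        for i in range(1, len(tokens) - 1):
            vectors[tokens[i]][("L", tokens[i - 1])] += 1
            vectors[tokens[i]][("R", tokens[i + 1])] += 1
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[k] * v[k] for k in u.keys() & v.keys())
    norm = sqrt(sum(x * x for x in u.values())) * sqrt(sum(x * x for x in v.values()))
    return dot / norm if norm else 0.0


# Hypothetical toy corpus (not taken from CORIS).
toy_corpus = [
    "il gatto dorme",
    "il cane dorme",
    "il gatto mangia",
    "il cane mangia",
]

vectors = context_vectors(toy_corpus)
sim_nouns = cosine(vectors["gatto"], vectors["cane"])   # same contexts
sim_mixed = cosine(vectors["gatto"], vectors["dorme"])  # no shared contexts
print(f"gatto ~ cane:  {sim_nouns:.2f}")
print(f"gatto ~ dorme: {sim_mixed:.2f}")
```

In this toy setting the two nouns occur in identical contexts and receive a high similarity, while the noun and the verb share no context at all. A realistic experiment would need a large corpus, wider or weighted contexts, and a clustering step to turn pairwise similarities into classes.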

Year: 2016
Volume title: Categorie grammaticali e classi di parole. Statuto e riflessi metalinguistici
First page: 115
Last page: 137
Series: LINGUE, LINGUAGGI, METALINGUAGGIO
All authors: D'Errico, M.; Grandi, N.; Paternesi Melloni, S.; Tamburini, F.
Appears in types: 2.01 Chapter / essay in book

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11585/574846

Warning: the data shown have not been validated by the university.
