Corpus léxico y diccionario: la estricta representatividad estadística

Hugo E. Lombardini; Silvia Bianconcini
2019

Abstract

Large corpora provide rich and comprehensive lexical information, but analyzing them can be cumbersome, particularly for lexicographic purposes. Much smaller sub-corpora can be extracted from the original corpus and analyzed instead. The key question is how large such a sub-corpus must be to preserve the main features of the original corpus, both qualitatively and quantitatively. We show how statistical methods can help determine the optimal sample size. To corroborate our findings, we consider the CREA corpus (Corpus de Referencia del Español Actual, the reference corpus of current Spanish) and take as our object of study the adjective externo and its meanings. We show that the different meanings of this word are preserved and well represented in a much smaller sub-corpus, for three countries: Argentina, Spain and Mexico.

Use this identifier to cite or link to this document: http://hdl.handle.net/11585/706969