

Gabrielatos C, Marchi A (2012). Keyness: Appropriate metrics and practical issues.

Keyness: Appropriate metrics and practical issues

Marchi A (second author)

2012

Abstract

In this paper we examine the definitions of two widely used, interrelated constructs in corpus linguistics, keyness and keywords, as presented in the literature and in corpus software manuals. In particular, we focus on:
• the consistency of definitions given in different sources;
• the metrics used to calculate the level of keyness;
• the compatibility between definitions and metrics.

Our survey of studies employing keyword analysis indicates that the vast majority examine only a subset of keywords, almost always the top X keywords as ranked by the metric used. This makes the choice of an appropriate metric central to any study using keyword analysis.

In this study, we first argue that an appropriate, and therefore useful, metric for keyness needs to be fully consistent with the definition of a keyword. We then use four sets of comparisons between corpora of different types and sizes to test whether, and to what extent, the use of different metrics affects the ranking of keywords. More precisely, we look at the extent of overlap in the keyword rankings produced by different metrics, and we discuss the implications of ranking-based analysis that adopts one metric or another. Finally, we propose a new metric for keyness and demonstrate a simple way to calculate it, which supplements the keyword extraction in existing corpus software.


Year: 2012
Volume title: Corpus-Assisted Discourse Studies: More than the sum of Discourse Analysis and computing?
First page: 19
Last page: 19
Citation: Gabrielatos C, Marchi A (2012). Keyness: Appropriate metrics and practical issues.
All authors: Gabrielatos C; Marchi A
Publication type: 4.02 Abstract


Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11585/793789

Warning: the data displayed have not been validated by the university.
