CRIS Current Research Information System

We present an evaluation framework for plagiarism detection.1 The framework provides performance measures that address the specifics of plagiarism detection, and the PAN-PC-10 corpus, which contains 64 558 artificial and 4 000 simulated plagiarism cases, the latter generated via Amazon's Mechanical Turk. We discuss the construction principles behind the measures and the corpus, and we compare the quality of our corpus to existing corpora. Our analysis gives empirical evidence that the construction of tailored training corpora for plagiarism detection can be automated, and hence be done on a large scale.

An evaluation framework for plagiarism detection

Potthast M.;Stein B.;Barron-Cedeno A.;Rosso P.

2010

Abstract

We present an evaluation framework for plagiarism detection.1 The framework provides performance measures that address the specifics of plagiarism detection, and the PAN-PC-10 corpus, which contains 64 558 artificial and 4 000 simulated plagiarism cases, the latter generated via Amazon's Mechanical Turk. We discuss the construction principles behind the measures and the corpus, and we compare the quality of our corpus to existing corpora. Our analysis gives empirical evidence that the construction of tailored training corpora for plagiarism detection can be automated, and hence be done on a large scale.

Scheda breve

Scheda completa

Scheda completa (DC)

	Anno
	
			2010
		
	Titolo del volume
	
			Coling 2010 - 23rd International Conference on Computational Linguistics, Proceedings of the Conference
		
	Pagina iniziale
	
			997
		
	Pagina finale
	
			1005
		
	Tutti gli autori
	
			Potthast M.; Stein B.; Barron-Cedeno A.; Rosso P.
		
	Appare nelle tipologie:
	
			4.01 Contributo in Atti di convegno

File in questo prodotto:

Eventuali allegati, non sono esposti

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/11585/709279

Attenzione

Attenzione! I dati visualizzati non sono stati sottoposti a validazione da parte dell'ateneo

Citazioni

ND

266

ND

social impact