CRIS Current Research Information System

Towards the exploitation of statistical language models for plagiarism detection with reference

Barron-Cedeno A.; Rosso P.

2008

Abstract

To plagiarise is to take credit for another person's work. In particular, plagiarism in text means including text fragments (or even an entire document) from an author without giving them the corresponding credit. In this work we describe our first attempt to detect plagiarised segments in a text by employing statistical Language Models (LMs) and perplexity. The preliminary experiments, carried out on two specialised and literary corpora (including original, part-of-speech and stemmed versions), show that the perplexity of a text segment, given a Language Model calculated over an author's text, is a relevant feature in plagiarism detection.
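As a rough illustration of the feature named in the abstract, the sketch below computes the perplexity of a text segment under a language model estimated from an author's text. It is not the authors' implementation: the bigram order, the add-one smoothing, whitespace tokenisation, and the function names `train_bigram_lm` and `perplexity` are all assumptions made for the example, and the record does not say how the score is thresholded or combined with other features.

```python
import math
from collections import Counter


def train_bigram_lm(tokens):
    """Count unigrams and bigrams over the reference (author) text.

    Assumption: a bigram model with add-one smoothing; the record
    does not state the model order or the smoothing method used.
    """
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    vocab_size = len(unigrams) + 1  # one extra slot for unseen words
    return unigrams, bigrams, vocab_size


def perplexity(segment, model):
    """Perplexity of a token segment under the smoothed bigram model."""
    unigrams, bigrams, vocab_size = model
    pairs = list(zip(segment, segment[1:]))
    if not pairs:
        raise ValueError("segment needs at least two tokens")
    log_prob = 0.0
    for prev, word in pairs:
        # Add-one smoothed estimate of P(word | prev).
        p = (bigrams[(prev, word)] + 1) / (unigrams[prev] + vocab_size)
        log_prob += math.log2(p)
    # Perplexity is 2 raised to the average negative log2 probability.
    return 2 ** (-log_prob / len(pairs))


# Toy data, invented for illustration only.
author_text = "the cat sat on the mat and the cat slept on the mat".split()
model = train_bigram_lm(author_text)

pp_in = perplexity("the cat sat on the mat".split(), model)
pp_out = perplexity("quantum flux disrupts stellar engines".split(), model)
print(pp_in, pp_out)
```

In this toy setting the segment that reuses the author's wording scores a lower perplexity than the unrelated one, which is the kind of contrast a detector could exploit as one feature among others.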

Year	2008
Volume title	CEUR Workshop Proceedings
First page	15
Last page	19
Series	CEUR Workshop Proceedings
Citation	Barron-Cedeno A., Rosso P. (2008). Towards the exploitation of statistical language models for plagiarism detection with reference.
All authors	Barron-Cedeno A.; Rosso P.
Appears in types	4.01 Contribution in conference proceedings

Files in this record:

File	Size	Format
paper2.pdf (open access; type: publisher's PDF / Version Of Record; licence: open access, Creative Commons Attribution (CC BY))	157.58 kB	Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11585/709318
