CorDis is a large, XML, TEI-conformant, POS-tagged, multimodal, multigenre corpus representing a significant portion of the political and media discourse on the 2003 Iraqi conflict. It was generated from different sub-corpora which had been assembled by various research groups, ranging from official transcripts of Parliamentary sessions, both in the US and the UK, to the transcripts of the Hutton Inquiry, from American and British newspaper coverage of the conflict to White House press briefings and to transcriptions of American and British TV news programmes. The heterogeneity of the data, the specificity of the genres and the diverse discourse analytical purposes of different groups had led to a wide range of coding strategies being employed to make textual and meta-textual information retrievable. The main purpose of this paper is to show the process of harmonisation and integration whereby a loose collection of texts has become a stable architecture.

The CorDis Corpus: Mark-up and related issues.

Marchi Anna
Co-primo
;
2007

Abstract

CorDis is a large, XML, TEI-conformant, POS-tagged, multimodal, multigenre corpus representing a significant portion of the political and media discourse on the 2003 Iraqi conflict. It was generated from different sub-corpora which had been assembled by various research groups, ranging from official transcripts of Parliamentary sessions, both in the US and the UK, to the transcripts of the Hutton Inquiry, from American and British newspaper coverage of the conflict to White House press briefings and to transcriptions of American and British TV news programmes. The heterogeneity of the data, the specificity of the genres and the diverse discourse analytical purposes of different groups had led to a wide range of coding strategies being employed to make textual and meta-textual information retrievable. The main purpose of this paper is to show the process of harmonisation and integration whereby a loose collection of texts has become a stable architecture.
Proceedings from the Corpus Linguistics Conference Series
1
10
Marchi Anna; Cirillo Letizia; Venuti Marco
File in questo prodotto:
Eventuali allegati, non sono esposti

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: http://hdl.handle.net/11585/792273
 Attenzione

Attenzione! I dati visualizzati non sono stati sottoposti a validazione da parte dell'ateneo

Citazioni
  • ???jsp.display-item.citation.pmc??? ND
  • Scopus ND
  • ???jsp.display-item.citation.isi??? ND
social impact