Building Interpreting and Intermodal Corpora: A How-to for a Formidable Task

Bernardini, S. (Conceptualization); Ferraresi, A. (Conceptualization); Russo, M. (Conceptualization)
2018

Abstract

This contribution has a double aim. On the one hand, it highlights the various challenges and problems that compilers of (simultaneous) interpreting and intermodal corpora are likely to face, and the solutions that were found and applied in three corpora of European Parliament plenary debates, i.e. EPIC, EPICG and EPTIC. On the other hand, it provides an accessible step-by-step guide for corpus developers who are working with European Parliament data, with the ultimate aim of harmonising as far as possible the procedures used to transcribe the audio tracks, record metadata, annotate texts with part-of-speech and lemma information, perform text-to-text and text-to-audio/video alignment, and index the corpus for searching with appropriate corpus query tools. By adopting shared corpus building methods, it might be possible to leverage the substantial efforts already deployed by different research groups, and to encourage others to take charge of new language pairs. This, we shall argue, might lead to a massively multilingual interpreting and intermodal corpus, built through a distributed community effort.

In: Making Way in Corpus-based Interpreting Studies, pp. 21-42
Bernardini, S.; Ferraresi, A.; Russo, M.; Collard, C.; Defrancq, B.

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: http://hdl.handle.net/11585/621904