Interpreting corpora serve as the descriptive foundation of research and the ‘ground truth’ against which machine interpreting technologies are evaluated. However, access to corpora remains a critical bottleneck in interpreting studies due to data collection and processing challenges and the absence of interpreting- and translation-specific corpus publication venues. In this article, we present two technical infrastructures that facilitate corpus access: a metadata schema which standardises corpus description and the Unified Interpreting Corpus (UNIC) platform for data and metadata search and publication. Guided by the internationally established FAIR (findability, accessibility, interoperability and reusability) and CARE (collective benefit, authority to control, responsibility and ethics) principles for scientific data management and stewardship, we designed the infrastructures based on a review of 125 spoken and signed language interpreting corpora, relevant international standards and community knowledge and also by using open-source technologies. Feedback obtained from interpreting students, researchers and interpreters demonstrates greater perceived usefulness of and satisfaction with UNIC compared to general-purpose search portals. Overall, we illustrate a value- and consensus-driven path towards optimising the use of interpreting corpora and the careful curation of new ones, which avoids the duplication of effort, helps to chart research directions and fosters co-design with communities.

Liu, N., Russo, M. (2025). A value-sensitive metadata schema for interpreting corpora: Implementation on the Unified Interpreting Corpus (UNIC) platform. INTERPRETING, 27(2), 157-196 [10.1075/intp.00123.liu].

A value-sensitive metadata schema for interpreting corpora: Implementation on the Unified Interpreting Corpus (UNIC) platform

Liu, N.; Russo, M.
2025

Files in this item:
Liu&Russo_A value-sensitive metadata schema for interpreting corpora.pdf

Access: open access
Type: Postprint / Author's Accepted Manuscript (AAM) - version accepted for publication after peer review
License: Open Access License. Creative Commons Attribution (CC BY)
Size: 2.23 MB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11585/1013497