We approach the task of assessing the suitability of a source text for translation by transferring the knowledge from established MT evaluation metrics to a model able to predict MT quality a priori from the source text alone. To open the door to experiments in this regard, we start from reference English-German parallel corpora to build a corpus of 14,253 tuples, each pairing a source text with its quality scores. The tuples include scores from four state-of-the-art metrics: cushLEPOR, BERTScore, COMET, and TransQuest. With this new resource at hand, we fine-tune XLM-RoBERTa, in both a single-task and a multi-task setting, to predict these evaluation scores from the source text alone. Results for this methodology are promising: the single-task model approximates well-established MT evaluation and quality estimation metrics without looking at the actual machine translations, achieving low RMSE values in the 0.1-0.2 range and Pearson correlation scores up to 0.688.
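The two figures reported above, RMSE and Pearson correlation, compare the scores a model predicts from the source text with the scores the reference metrics assign. The following is a minimal sketch of how those two quantities are defined, not the authors' evaluation code: the function names `rmse` and `pearson` are hypothetical, and the score lists are invented toy values, not data from the paper.

```python
import math


def rmse(predicted, reference):
    """Root mean squared error between two equal-length score lists."""
    if len(predicted) != len(reference) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(p - r) ** 2 for p, r in zip(predicted, reference)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length score lists."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("inputs must have equal length of at least 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return covariance / math.sqrt(var_x * var_y)


# Toy values, invented for illustration only (NOT results from the paper).
reference_scores = [0.80, 0.65, 0.90, 0.40, 0.72]  # e.g. scores from one metric
predicted_scores = [0.75, 0.70, 0.82, 0.55, 0.68]  # e.g. source-only predictions

print(f"RMSE:    {rmse(predicted_scores, reference_scores):.3f}")
print(f"Pearson: {pearson(predicted_scores, reference_scores):.3f}")
```

A low RMSE means the predicted scores sit close to the metric scores in absolute terms, while a high Pearson value means the predictions rank and scale with them; the two can diverge, which is presumably why both are reported.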

Return to the Source: Assessing Machine Translation Suitability / Fernicola Francesco, Bernardini Silvia, Garcea Federico, Ferraresi Adriano, Barrón-Cedeño Alberto. - Electronic. - (2023), pp. 79-89. (Paper presented at the 24th Annual Conference of the European Association for Machine Translation, held in Tampere, Finland, in 2023).

Return to the Source: Assessing Machine Translation Suitability

Fernicola Francesco (first author); Bernardini Silvia (second author); Garcea Federico; Ferraresi Adriano; Barrón-Cedeño Alberto (last author)
2023

Year: 2023
Published in: Proceedings of the 24th Annual Conference of the European Association for Machine Translation
Pages: 79-89
Files in this item:
File: 2023.eamt-1.9.pdf
Access: open access
Type: Publisher's version (PDF)
License: Open Access License. Creative Commons Attribution - NoDerivatives (CC BY-ND)
Size: 267.7 kB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11585/953418