Training an NMT system for legal texts of a low-resource language variety South Tyrolean German - Italian
Conference poster · Open access

A Oliver, S Alvarez-Vidal, Egon Waldemar Stemle and Elena Chiocchetti
EAMT2024 (The 25th Annual Conference of The European Association for Machine Translation) (Sheffield, 24/06/2024 - 27/06/2024)
2024
Handle: https://hdl.handle.net/10863/43173

Abstract

This presentation and poster describe the process of training and evaluating NMT systems for a language pair that includes a low-resource language variety. We compiled a parallel corpus of legal texts for this language pair. Because the compiled corpus is too small to train a system on its own, we combined it with several other parallel corpora, using data weighting at sentence level. We evaluated each combination, as well as two popular commercial systems.
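
As a rough illustration of what combining corpora with sentence-level data weighting can look like, the minimal Python sketch below tags every sentence pair with the weight of the corpus it comes from. It is not the authors' pipeline: the function name `combine_with_weights`, the toy sentences, and the weight values 2.0 and 1.0 are all hypothetical and chosen only for illustration.

```python
from typing import Iterable, List, Tuple

# One parallel corpus: a list of (source sentence, target sentence) pairs.
Corpus = List[Tuple[str, str]]


def combine_with_weights(
    corpora: Iterable[Tuple[Corpus, float]],
) -> List[Tuple[str, str, float]]:
    """Concatenate corpora, tagging every sentence pair with its corpus weight."""
    combined = []
    for corpus, weight in corpora:
        if weight <= 0:
            raise ValueError("weights must be positive")
        for source, target in corpus:
            combined.append((source, target, weight))
    return combined


# Toy data; the sentences and the weight values are illustrative assumptions.
in_domain = [("Das Gesetz tritt in Kraft.", "La legge entra in vigore.")]
general = [
    ("Guten Morgen.", "Buongiorno."),
    ("Vielen Dank.", "Grazie mille."),
]

# Hypothetical choice: in-domain legal sentences count twice as much.
weighted = combine_with_weights([(in_domain, 2.0), (general, 1.0)])
```

A training toolkit that supports sentence-level weighting could then scale each sentence pair's contribution to the loss by the third field; how the weights are chosen in practice is not stated in the abstract.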
eamt_2024_EURAC_ita_deu_presentation (pdf, 135.42 kB, Open Access)
eamt_2024_EURAC_ita_deu_poster (pdf, 180.60 kB, Open Access)
url: https://eamt2024.sheffield.ac.uk/
