Training an NMT system for legal texts of a low-resource language variety South Tyrolean German - Italian

A Oliver; S Alvarez-Vidal; Egon Waldemar Stemle; Elena Chiocchetti

Conference poster

Open access

EAMT2024 (The 25th Annual Conference of The European Association for Machine Translation) (Sheffield, 24/06/2024 - 27/06/2024)

2024

Handle:

https://hdl.handle.net/10863/43173

Abstract

This presentation and poster describe the process of training and evaluating NMT systems for a language pair that includes a low-resource language variety. We compiled a parallel corpus of legal texts for this language pair. Because the compiled corpus is too small to train a system on its own, we combined it with several other parallel corpora using data weighting at the sentence level. We evaluated each combination, as well as two popular commercial systems.
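To make the idea of sentence-level data weighting concrete, the following is a minimal sketch in Python. It is not the authors' pipeline: the abstract does not name the toolkit, the corpora, or the weight values, so the function names, the example sentences, and the weights (3.0 for the in-domain corpus, 1.0 for the general one) are all hypothetical. The sketch only illustrates the general technique, in which every sentence pair carries a weight that scales its contribution to the training loss.

```python
from typing import Iterable, List, Sequence, Tuple

SentencePair = Tuple[str, str]               # (source sentence, target sentence)
WeightedPair = Tuple[str, str, float]        # (source, target, sentence weight)


def combine_with_weights(
    corpora: Iterable[Tuple[List[SentencePair], float]],
) -> List[WeightedPair]:
    """Concatenate corpora, tagging each sentence pair with its corpus weight."""
    combined: List[WeightedPair] = []
    for pairs, weight in corpora:
        if weight <= 0:
            raise ValueError("sentence weights must be positive")
        for src, tgt in pairs:
            combined.append((src, tgt, weight))
    return combined


def weighted_mean_loss(losses: Sequence[float], weights: Sequence[float]) -> float:
    """Weighted average of per-sentence losses (higher weight, more influence)."""
    if len(losses) != len(weights):
        raise ValueError("need exactly one weight per sentence loss")
    return sum(l * w for l, w in zip(losses, weights)) / sum(weights)


# Hypothetical data: invented sentences and weights, for illustration only.
in_domain = [("Das Gesetz tritt in Kraft.", "La legge entra in vigore.")]
general = [
    ("Guten Morgen.", "Buongiorno."),
    ("Vielen Dank.", "Grazie mille."),
]

# Assumption: the small in-domain legal corpus is up-weighted relative to
# the larger general corpus so that it is not drowned out during training.
combined = combine_with_weights([(in_domain, 3.0), (general, 1.0)])

for src, tgt, weight in combined:
    print(f"{weight}\t{src}\t{tgt}")
```

Weighting at the sentence level, rather than duplicating the in-domain data, keeps the combined corpus at its natural size while still letting the scarce legal sentences count for more in each update.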

Files and links (3)

pdf: eamt_2024_EURAC_ita_deu_presentation (135.42 kB, Open Access)

pdf: eamt_2024_EURAC_ita_deu_poster (180.60 kB, Open Access)

url: https://eamt2024.sheffield.ac.uk/

Details

Title: Training an NMT system for legal texts of a low-resource language variety South Tyrolean German - Italian
Creators: A Oliver; S Alvarez-Vidal; Egon Waldemar Stemle; Elena Chiocchetti
Conference: EAMT2024 (The 25th Annual Conference of The European Association for Machine Translation) (Sheffield, 24/06/2024 - 27/06/2024)
Identifiers: (EURAC)28593517; 991006856297901241
Academic Unit: Institute for Applied Linguistics
Language: English
Resource Type: Conference poster
Description coverage: international
Description audience: Scientific
Local Fields: Scientific
Author Names String: Oliver A, Alvarez-Vidal S, Stemle EW, Chiocchetti E
