Meta-Evaluation of automatic Machine Translation Metrics between Italian and a minor Language Variety of German
Conference proceeding   Open access   Peer reviewed

Paolo Di Natale and E Stemle
Proceedings of the Eleventh Italian Conference on Computational Linguistics (CLiC-it 2025), pp.371-383
Eleventh Italian Conference on Computational Linguistics (Cagliari, 24/09/2025–26/09/2025)
2025
Handle: https://hdl.handle.net/10863/50227

Abstract

Keywords: automatic machine translation evaluation metrics; metrics meta-evaluation; non-English language combination; minor language variety; machine translation; natural language generation evaluation; specialized communication
We present the first meta-evaluation of Automatic Machine Translation Evaluation (AMTE) metrics between Italian and South Tyrolean German, a low-resourced standard variety of German. This minor German variety is recognised as a co-official language at the local level and is used by the local public administration and legislature. We evaluate metric agreement with human judgments across translation quality levels, using a dataset of bilingual machine-translated decrees annotated with human-curated error tags. Our findings show that embedding-based metrics perform best for evaluating high-quality translations, while learned neural metrics correlate more strongly with human judgments in lower-quality ranges. We also expose a persistent bias in AMTE against minor language varieties and make suggestions about the design of linguistic resources for envisaged custom metric development.
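As a rough illustration of what measuring metric agreement with human judgments across quality levels can look like, the sketch below splits segments into a high and a low band by human score and computes Kendall's tau in each band. Everything in it is an assumption made for illustration: the abstract does not say which correlation statistic, score scale, or band threshold the authors use, and the function names and toy numbers are hypothetical.

```python
from itertools import combinations


def kendall_tau(xs, ys):
    """Kendall's tau-a: (concordant - discordant) / total number of pairs."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 items")
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(xs, ys), 2):
        sign = (x1 - x2) * (y1 - y2)
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
        # ties (sign == 0) count toward neither group in tau-a
    n_pairs = len(xs) * (len(xs) - 1) // 2
    return (concordant - discordant) / n_pairs


def agreement_by_quality_band(human, metric, threshold):
    """Split segments by human score (assumed threshold) and compute tau per band."""
    bands = {"high": ([], []), "low": ([], [])}
    for h, m in zip(human, metric):
        name = "high" if h >= threshold else "low"
        bands[name][0].append(h)
        bands[name][1].append(m)
    # Bands with fewer than two segments are skipped: tau is undefined there.
    return {
        name: kendall_tau(hs, ms)
        for name, (hs, ms) in bands.items()
        if len(hs) >= 2
    }


# Toy data, not taken from the paper: one human score and one metric score per segment.
human = [0.9, 0.8, 0.95, 0.3, 0.2, 0.5]
metric = [0.7, 0.6, 0.9, 0.1, 0.4, 0.5]
print(agreement_by_quality_band(human, metric, threshold=0.7))
```

Comparing the per-band values for several metrics would show whether a metric that tracks human judgments well on good translations also does so on poor ones, which is the kind of contrast the abstract reports.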
PDF: 2025.clicit-1.38 (1.19 MB, Open Access)
URL: https://aclanthology.org/2025.clicit-1/
URL: https://aclanthology.org/2025.clicit-1.38/
