The PAISÀ Corpus of Italian Web Texts
(Association for Computational Linguistics, 2014)
PAISÀ is a Creative Commons licensed, large web corpus of contemporary Italian. We describe the design, harvesting, and processing steps involved in its creation.
Correcting OCR errors for German in Fraktur font
In this paper, we present ongoing experiments for correcting OCR errors on German newspapers in Fraktur font. Our approach borrows from techniques for spelling correction in context using a probabilistic edit-operation ...
Community involvement for transcribing historical correspondences of South Tyrolean interest: A DI-ÖSS use case
In this paper, we present a local research and Citizen Science initiative for the enrichment and analysis of handwritten historical postcards and letters by means of crowdsourcing. The documents are authentic communications ...
Building a Digital Infrastructure in South Tyrol
With this article we present the DI-ÖSS project, a local infrastructure initiative for South Tyrol, which aims at connecting institutions and organizations that are working with language data. The digital infrastructure ...
DI-ÖSS: Building a digital infrastructure in South Tyrol
This paper presents the DI-ÖSS project, a local digital infrastructure initiative for South Tyrol, which aims at connecting institutions and organizations that are working with language data. It shall serve to facilitate ...
EnetCollect in Italy
Sprachenlernen und Crowdsourcing: ein innovatives Projekt [Language Learning and Crowdsourcing: An Innovative Project]
This contribution presents the European Network for Combining Language Learning with Crowdsourcing Techniques (enetCollect), which is funded as a COST Action for four years starting in 2017. First, ...
Transc&Anno: A Graphical Tool for the Transcription and On-the-Fly Annotation of Handwritten Documents
We present Transc&Anno, a web-based collaboration tool allowing the transcription of text images and their shallow on-the-fly annotation. Transc&Anno was originally developed in order to address the needs of learner ...