Open demands for corpus analysis tools - a user-centered study

Verena Ruth Hilde Lyding

Dissertation

Open access

Doctor of Philosophy (PHD), Friedrich-Alexander-Universität Erlangen-Nürnberg

17/02/2022

Handle:

https://hdl.handle.net/10863/25020

Abstract

Keywords: corpus analysis tools; corpus linguistics; user-centered study

It was not until the advent of powerful computers that corpus linguistics developed into a widely applied research methodology. Indeed, corpus linguistics relies heavily on computer-powered analysis tools, which corpus linguists use daily to retrieve examples and analyze authentic data from very large corpora. Despite the indisputable importance of these tools, it has repeatedly been remarked that they have evolved little since their early days. Concordances, frequency lists and collocation extraction still constitute the core functionalities of most corpus tools. With the aim of encouraging new functional developments, this thesis presents research on open demands in current corpus research practice and the related requirements for tool support. It builds on the assumption that more user-centered research is needed to bridge the gap between tool developers, who are mainly computationally trained, and their expert linguist users, who bring specialized domain knowledge and often sophisticated analytical needs. The research comprises three user investigations that inquire into corpus research workflows and analysis activities as well as theoretical principles and methodological considerations in corpus linguistics research practice. In this way, a comprehensive picture of the corpus usage situation is assembled by combining insights from open-ended enquiries (interviews) with quantitative results on selected aspects of the corpus analysis scenario (questionnaire), drawn from more than 100 corpus users overall. Based on the results, a range of open demands for corpus research and tools is identified and discussed. They relate to (1) corpus resources, (2) general aspects of tools, (3) corpus analysis procedures, and (4) best practices.
The results show that open demands address challenges on very different operational levels, ranging from the availability of corpus resources and reliable annotations, through technical requirements concerning scalability and interoperability, usability, and technical and methodological skills, up to functional demands proper. The thesis discusses potential ways to address the open demands and provides pointers to recent developments in corpus linguistics and related fields, in particular computational linguistics and natural language processing as well as linguistic information visualization. The research contribution of this thesis is twofold. On the methodological level, it elaborates on methods and challenges for user-centered research on tools for open-ended tasks and provides entry points for further user-centered research by identifying and organizing, as a reference, the basic building blocks of corpus linguistics research. On the content level, it provides first insights into user perspectives and needs related to corpus research practice. It describes concrete demands and discusses paths to their solution. In this way, it prepares the ground for further in-depth studies and user-centered development of new corpus functionalities for specific demands.

Files and links (1)

pdf

DissertationVerenaLyding11.69 MB

Open Access

Details

Title: Open demands for corpus analysis tools - a user-centered study
Creators: Verena Ruth Hilde Lyding
Contributors: Stephanie Evert (Supervisor)
Awarding Institution: Friedrich-Alexander-Universität Erlangen-Nürnberg; Doctor of Philosophy (PHD)
Theses and Dissertations: Doctor of Philosophy (PHD), Friedrich-Alexander-Universität Erlangen-Nürnberg
Publisher: Erlangen
Identifiers: (EURAC)25860050; 991006431496301241
Copyright: No Creative Commons license; the publication agreement and German copyright law apply. Under the publication agreement, the University Library of Erlangen-Nürnberg is entitled to make the work available for use on the internet in accordance with the conditions stated therein. Within the scope of this provision, users are entitled to use documents free of charge in accordance with the German Copyright Act (Urheberrechtsgesetz), in particular to download and save the document, or to print it in small numbers, for private and other personal use (§ 53 UrhG). For further rights and obligations of the contracting parties, see http://www.ub.fau.de/opus/veroeffentlichungsvertrag.pdf.
Academic Unit: Institute for Applied Linguistics
Language: English
Resource Type: Dissertation
Description audience: Scientific
Local Fields: Scientific
Author Names String: Lyding V
Supervisor(s): Prof. Dr. Stephanie Evert
Academic Unit: Philosophische Fakultät und Fachbereich Theologie
