The Indo-European Cognate Relationships dataset

C Anderson; M Scarborough; L Jocz; MJ Kümmel; T Jügel; B Irslinger; R Pooth; H Liljegren; RF Strand; G Haig; U Geupel; M Macak; RI Kim; E Anonby; T Pronk; O Belyaev; TK Dewey-Findell; M Boutilier; C Freiberg; R Tegethoff; M Serangeli; K Stroński; A Falileyev; N Liosis; K Schulte; G Kumar Gupta; R Izadifar; P Markus; N Williams; S Loi; N Sims-Williams; M Findell; S Adibifar; G Abete; P Atanasov; E Baiwir; MR Bastardas; A Benkato; LS Bevevino; E Buchi; G Cadorini; C Cathcart; L Cheveau; C Christodoulou; J Delorme; SN Dworkin; D Ekici; S Farridnejad; M Gheitasi; H Hammarström; S Hewitt; A Ali Khan; M Kamal Khan; L Khokhlova; D Kim; C Lewin; B Lushaj; P Mahmoudveysi; M Mahommadirad; S Mersch; B Mustafa; F Nemati; M Nourzaei; P Muircheartaigh; V Oogjen; M Ourang; H Pagan; TS Palmer; S Pepper; M Purandare; K Rehman; G Rhys; U Røyneland; MZ Sagar; JJ Sandstedt; L Steensland; M Taheri-Ardali; M Talebi-Dastenaei; S Tittel; T Tresoldi; M deVaan; A Verkerk; A Versloot; Paul Videsott; N Vuletić; M Widmer; A Zeini; HJ Bibiko; F Runge; RD Gray; P Heggarty

doi:10.1038/s41597-025-05445-3

Journal article

Open access

Peer reviewed

Scientific Data, (12), pp.1-27

2025

DOI: https://doi.org/10.1038/s41597-025-05445-3

Handle: https://hdl.handle.net/10863/49391

PMID: 40897732

Abstract

The Indo-European Cognate Relationships (IE-CoR) dataset is an open-access relational dataset showing how related, inherited words (‘cognates’) pattern across 160 languages of the Indo-European family. IE-CoR is intended as a benchmark dataset for computational research into the evolution of the Indo-European languages. It is structured around 170 reference meanings in core lexicon, and contains 25731 lexeme entries, analysed into 4981 cognate sets. Novel, dedicated structures are used to code all known cases of horizontal transfer. All 13 main documented clades of Indo-European, and their main subclades, are well represented. Time calibration data for each language are also included, as are relevant geographical and social metadata. Data collection was performed by an expert consortium of 89 linguists drawing on 355 cited sources. The dataset is extendable to further languages and meanings and follows the Cross-Linguistic Data Format (CLDF) protocols for linguistic data. It is designed to be interoperable with other cross-linguistic datasets and catalogues, and provides a reference framework for similar initiatives for other language families.
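
As a rough illustration of the relational structure described in the abstract, the Python sketch below groups lexeme entries by cognate set and checks whether two languages share a cognate for a given reference meaning. The field names, the cognate-set IDs and the four toy rows are assumptions made for this example; they are not taken from IE-CoR or its CLDF files.

```python
from collections import defaultdict
from typing import NamedTuple


class Lexeme(NamedTuple):
    language: str
    meaning: str
    form: str
    cognate_set: str  # hypothetical ID, not an IE-CoR identifier


# Toy rows for illustration only; they are not copied from the dataset.
LEXEMES = [
    Lexeme("English", "WATER", "water", "water-1"),
    Lexeme("German", "WATER", "Wasser", "water-1"),
    Lexeme("French", "WATER", "eau", "water-2"),
    Lexeme("Spanish", "WATER", "agua", "water-2"),
]


def group_by_cognate_set(lexemes):
    """Map each cognate-set ID to the sorted languages with a lexeme in it."""
    groups = defaultdict(set)
    for lex in lexemes:
        groups[lex.cognate_set].add(lex.language)
    return {cid: sorted(langs) for cid, langs in groups.items()}


def share_cognate(lexemes, lang_a, lang_b, meaning):
    """Return True if both languages have a lexeme for `meaning` in a common set."""
    sets_a = {lex.cognate_set for lex in lexemes
              if lex.language == lang_a and lex.meaning == meaning}
    sets_b = {lex.cognate_set for lex in lexemes
              if lex.language == lang_b and lex.meaning == meaning}
    return bool(sets_a & sets_b)


groups = group_by_cognate_set(LEXEMES)
print(groups)
print(share_cognate(LEXEMES, "English", "German", "WATER"))
print(share_cognate(LEXEMES, "English", "French", "WATER"))
```

In this toy layout each row ties one language and one reference meaning to one cognate set, so comparisons between languages reduce to set intersections over cognate-set IDs.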

Files and links (2)

pdf

s41597-025-05445-3

Open Access

url

https://www.nature.com/articles/s41597-025-05445-3

Details

Title: The Indo-European Cognate Relationships dataset
Creators: C Anderson - Max Planck Institute for Evolutionary Anthropology
M Scarborough - Statistics Denmark
L Jocz - The Jacob of Paradies University
MJ Kümmel - Friedrich Schiller University Jena
T Jügel - Ruhr University Bochum
B Irslinger - Saxon Academy of Sciences and Humanities in Leipzig
R Pooth - Ghent University
H Liljegren - Stockholm University
RF Strand - Film Independent
G Haig - University of Bamberg
U Geupel - Philipps University of Marburg
M Macak
RI Kim - Adam Mickiewicz University in Poznań
E Anonby - University of Bamberg
T Pronk - Leiden University
O Belyaev - Institute of Linguistics
TK Dewey-Findell - University of Nottingham
M Boutilier - University of Wisconsin–Madison
C Freiberg
R Tegethoff - Friedrich Schiller University Jena
M Serangeli - Friedrich Schiller University Jena
K Stroński - Adam Mickiewicz University in Poznań
A Falileyev - Universidad de Salamanca
N Liosis - Aristotle University of Thessaloniki
K Schulte - Universitat Jaume I
G Kumar Gupta
R Izadifar - Bu-Ali Sina University
P Markus - Adam Mickiewicz University in Poznań
N Williams - University College Dublin
S Loi - Leipzig University
N Sims-Williams - SOAS University of London
M Findell - University of Nottingham
S Adibifar - University of Bamberg
G Abete - University of Naples Federico II
P Atanasov - Institute of public health of Republic of Macedonia
E Baiwir - Université de Lille
MR Bastardas - Universitat de Barcelona
A Benkato - Berkeley College
LS Bevevino - University of Minnesota Morris
E Buchi - Université de Lorraine
G Cadorini - Masaryk University
C Cathcart - University of Zurich
L Cheveau - Université Rennes 2
C Christodoulou - University of Cyprus
J Delorme - Institut National des Langues et Civilisations Orientales
SN Dworkin - University of Michigan
D Ekici - Film Independent
S Farridnejad - Universität Hamburg
M Gheitasi - Ilam University
H Hammarström - Uppsala University
S Hewitt - Film Independent
A Ali Khan
M Kamal Khan
L Khokhlova - Institute for African Studies
D Kim - Film Independent
C Lewin - Sabhal Mòr Ostaig
B Lushaj - Radboud University Nijmegen
P Mahmoudveysi - Universität Hamburg
M Mahommadirad - Laboratoire d'études sur les monothéismes
S Mersch - University of Luxembourg
B Mustafa - University of Bamberg
F Nemati - Persian Gulf University
M Nourzaei - Uppsala University
P Muircheartaigh - University of Edinburgh
V Oogjen - Leiden University
M Ourang - UNSW Sydney
H Pagan - University of Westminster
TS Palmer - Leiden University
S Pepper - University of Oslo
M Purandare - Adam Mickiewicz University in Poznań
K Rehman - University of Azad Jammu and Kashmir
G Rhys
U Røyneland - University of Oslo
MZ Sagar - Development Initiatives
JJ Sandstedt - Volda University College
L Steensland - Film Independent
M Taheri-Ardali - Shahrekord University
M Talebi-Dastenaei - Alzahra University
S Tittel - Heidelberg Academy of Sciences and Humanities
T Tresoldi - Uppsala University
M deVaan
A Verkerk - Saarland University
A Versloot - Geneeskundige en Gezondheidsdienst
Paul Videsott - Free University of Bozen-Bolzano
N Vuletić - University of Zadar
M Widmer - Swiss National Museum
A Zeini - University of Oxford
HJ Bibiko - Max Planck Institute for Evolutionary Anthropology
F Runge - Film Independent
RD Gray - University of Auckland
P Heggarty - Pontificia Universidad Católica del Perú
Publication Details: Scientific Data, (12), pp.1-27
ISSN: 2052-4463
EISSN: 2052-4463
Publisher: Nature Research
Number of pages: 27
Identifiers: (UNIBZ)90899142
991007138406301241
Scopus ID: 2-s2.0-105014915657
Copyright: This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
Academic Unit: Faculty of Education
Language: English
Resource Type: Journal article
