talnarchives

A French-language digital archive of research articles in Natural Language Processing.

Universal Dependencies for Irish

Teresa Lynn, Jennifer Foster

Abstract: Language resources that enable cross-lingual studies have become increasingly valuable for lesser-resourced languages such as Irish, as they allow for easier sharing of resources, thus overcoming the problem of data scarcity. The Universal Dependencies (UD) Project is an initiative aimed at cross-lingual studies of treebanks, linguistic structures and parsing. Its goal is to create a set of multilingual harmonised treebanks that are designed according to a universal annotation scheme. In this paper, we report on the conversion of the Irish Dependency Treebank (IDT) (Lynn, 2016) to a UD version of the treebank which we term the Irish Universal Dependency Treebank (IUDT). We report on the mapping of the IDT labelling scheme to the UD scheme, along with a clear description of the structural changes required in this conversion.

Keywords: parsing, Irish, dependency treebank, universal dependencies, mapping, labels.