Automatic derivation of categorial grammar from a part-of-speech-tagged corpus in Scottish Gaelic

Colin Batchelor

Abstract: We present a preliminary categorial grammar for Scottish Gaelic derived automatically from the University of Edinburgh’s Annotated Reference Corpus of Scottish Gaelic (ARCOSG), which contains over 80 000 tokens of part-of-speech-tagged text in multiple genres. We discuss our methods for deriving this grammar, the distinctive features of Scottish Gaelic and of the corpus, and parsing with combinatory categorial grammar (CCG), and we set out what is needed for a rigorous and systematic evaluation of the work presented here.

Keywords: Scottish Gaelic, categorial grammar, CCG.