Investigating associative, switchable and negatable Winograd items on renewed French data sets
Xiaoou Wang, Olga Seminck, Pascal Amsili
Abstract: The Winograd Schema Challenge (WSC) consists of a set of anaphora resolution problems that can be resolved only by reasoning about world knowledge. This article describes an update of the existing French data set and the creation of three subsets that allow a more robust, fine-grained evaluation protocol for the WSC in French (FWSC): an associative subset (items easily resolvable through lexical co-occurrence), a switchable subset (items where swapping two keywords reverses the answer), and a negatable subset (items where negating the verb reverses the answer). Experiments on these data sets with CamemBERT reach state-of-the-art performance. Our evaluation protocol further shows that this higher performance could be explained by the presence of associative items in FWSC. Moreover, increasing the size of the training corpus improves the model’s performance on switchable items, whereas its impact on negatable items remains small.
Keywords: Winograd Schema Challenge, world knowledge, commonsense reasoning, negation, French, CamemBERT.