Exploring sentence informativeness
Syrielle Montariol, Aina Garí Soler, Alexandre Allauzen
Abstract: This study is a preliminary exploration of the concept of informativeness (how much information a sentence gives about a word it contains) and its potential benefits for building quality word representations from scarce data. We propose several sentence-level classifiers to predict informativeness, and we manually annotate a set of sentences. We conclude that the classifiers’ predictions and the manual annotations correspond to different notions of informativeness. However, our experiments show that using the classifiers’ predictions to train word embeddings has an impact on embedding quality.
Keywords: Informativeness, Word embeddings, Sentence classification, Data annotation.