talnarchives

A French-language digital archive of research articles in Natural Language Processing.

Adapted Sentiment Similarity Seed Words For French Tweets’ Polarity Classification

Amal Htait

Abstract: In this paper, we present our contribution to DEFT 2018 Task 2, "Global polarity", which consists of determining the overall polarity (Positive, Negative, Neutral or MixPosNeg) of French tweets about public transport. Our system relies on a list of sentiment seed-words adapted to French public transport tweets. These seed-words are extracted from DEFT’s annotated training dataset, and the sentiment relations between seed-words and other terms are captured by the cosine similarity of their word embedding representations, using a French word embeddings model of 683k words. Our semi-supervised system achieved an F1-measure of 0.64.
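To make the seed-word idea concrete, here is a minimal sketch of scoring a tweet by the cosine similarity between its words and positive or negative seed-words. It is an illustration under stated assumptions, not the system described in the paper. The toy 3-dimensional vectors, the seed lists, and the scoring rule (mean similarity to positive seeds minus mean similarity to negative seeds, averaged over known tokens) are all hypothetical. The abstract does not say how scores are mapped to the four polarity classes, so that step is left out.

```python
import math

# Hypothetical toy embeddings (3 dimensions) for illustration only.
# A real system would load vectors from a trained word embeddings model.
EMBEDDINGS = {
    "super": [0.9, 0.1, 0.0],
    "merci": [0.8, 0.2, 0.1],
    "retard": [0.1, 0.9, 0.0],
    "panne": [0.0, 0.8, 0.2],
    "ponctuel": [0.7, 0.2, 0.1],
    "bloqué": [0.1, 0.7, 0.2],
}

# Hypothetical seed-word lists; the paper extracts its seeds from DEFT training data.
POSITIVE_SEEDS = ["super", "merci"]
NEGATIVE_SEEDS = ["retard", "panne"]


def cosine(u, v):
    """Cosine similarity between two vectors; 0.0 if either has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def mean_similarity(word, seeds):
    """Average cosine similarity between a known word and the known seeds."""
    sims = [cosine(EMBEDDINGS[word], EMBEDDINGS[s]) for s in seeds if s in EMBEDDINGS]
    return sum(sims) / len(sims) if sims else 0.0


def word_score(word):
    """Assumed scoring rule: similarity to positive seeds minus similarity to negative seeds."""
    if word not in EMBEDDINGS:
        return 0.0
    return mean_similarity(word, POSITIVE_SEEDS) - mean_similarity(word, NEGATIVE_SEEDS)


def tweet_score(tokens):
    """Average word score over tokens that have an embedding; 0.0 if none do."""
    known = [t for t in tokens if t in EMBEDDINGS]
    if not known:
        return 0.0
    return sum(word_score(t) for t in known) / len(known)


if __name__ == "__main__":
    print(tweet_score(["train", "ponctuel"]))  # positive score expected
    print(tweet_score(["métro", "bloqué"]))    # negative score expected
```

In this sketch, a positive score suggests the tweet's vocabulary sits closer to the positive seeds, and a negative score suggests the opposite. Out-of-vocabulary tokens are simply skipped.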

Keywords: Seed-words, Twitter, Similarity Measures, Word Embeddings, Word2vec.

Mots-clés: Mots-graines, Twitter, Mesure de la Similarité, Plongement de mot, Word2vec.