Comparing Named-Entity Recognizers in a Targeted Domain: Handcrafted Rules vs Machine Learning
Ioannis Partalas, Cédric Lopez, Frédérique Segond
Abstract: Named-Entity Recognition concerns the classification of textual objects into a predefined set of categories such as persons, organizations, and locations. Although Named-Entity Recognition has been well studied for 20 years, its application to specialized domains still poses challenges for current systems. We developed a rule-based system and two machine learning approaches to tackle the same task: the recognition of product names, brand names, etc., in the cosmetics domain, for French. Our systems can thus be compared under ideal conditions. In this paper, we introduce the rule-based and machine learning systems and compare them.
Keywords: NER, e-Commerce, rule-based system, machine learning system.