Technical Reports

HPL-2010-166

Fine Grained Classification of Named Entities In Wikipedia

Tkachenko, Maksim; Ulanov, Alexander; Simanovsky, Andrey
HP Laboratories

Keyword(s): named entity recognition; Wikipedia; classification

Abstract: This report describes a study on classifying Wikipedia articles into an extended set of named entity classes. We employed a semi-automatic method to extend Wikipedia class annotation and created a training set for 15 named entity classes. We implemented two classifiers. A binary named-entity classifier decides between articles about named entities and other articles. A support vector machine (SVM) classifier trained on a variety of Wikipedia features determines the class of a named entity. Combining the two classifiers boosted classification quality beyond the state of the art.
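
The two-classifier combination described in the abstract can be pictured as a cascade: a binary gate filters out articles that are not about named entities, and only the remaining articles reach the fine-grained classifier. The following is a minimal sketch under stated assumptions, not the report's implementation. The function names, the article dictionary layout, the keyword rules, and the class labels are all hypothetical stand-ins; the report's second stage is a trained SVM over Wikipedia features, which is not reproduced here.

```python
from typing import Callable, Optional

Article = dict  # assumed layout: {"title": str, "categories": list[str]}


def is_named_entity(article: Article) -> bool:
    """Stage 1 (hypothetical toy gate): reject list and disambiguation pages.

    The report uses a trained binary classifier here, not a title rule.
    """
    title = article["title"]
    return not (title.startswith("List of") or title.endswith("(disambiguation)"))


def fine_grained_class(article: Article) -> str:
    """Stage 2 (hypothetical toy lookup): guess a class from category names.

    The report uses an SVM trained on Wikipedia features; the labels below
    are illustrative and are not the report's 15 classes.
    """
    categories = " ".join(article.get("categories", [])).lower()
    if "births" in categories:
        return "PERSON"
    if "cities" in categories:
        return "CITY"
    if "companies" in categories:
        return "COMPANY"
    return "OTHER_ENTITY"


def classify(
    article: Article,
    gate: Callable[[Article], bool] = is_named_entity,
    classifier: Callable[[Article], str] = fine_grained_class,
) -> Optional[str]:
    """Cascade: return None for non-entities, else the fine-grained class."""
    if not gate(article):
        return None
    return classifier(article)
```

Because both stages are passed in as callables, trained models could be substituted for the toy rules without changing the cascade itself.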

10 Pages

External Posting Date: October 21, 2010. Approved for External Publication
Internal Posting Date: October 21, 2010
