Unlabeled Data Can Degrade Classification Performance of Generative Classifiers
Cozman, Fabio G.; Cohen, Ira
HPL-2001-234
Keyword(s): semi-supervised learning; labeled and unlabeled data problem; classification; maximum-likelihood estimation; EM algorithm
Abstract: This report analyzes the effect of unlabeled training data in generative classifiers. We are interested in classification performance when unlabeled data are added to an existing pool of labeled data. We show that there are situations where unlabeled data can degrade the performance of a classifier. We present an analysis of these situations and explain several seemingly disparate results in the literature.
16 Pages