The effect of unlabeled data on generative classifiers, with application to model selection
Cohen, Ira; Cozman, Fabio G.; Bronstein, Alexandre
HPL-2002-140
Keyword(s): semi-supervised learning; labeled and unlabeled data problem; classification; machine learning
Abstract: In this paper we investigate the effect of unlabeled data on generative classifiers in semi-supervised learning. We first characterize situations where unlabeled data cannot change estimates obtained with labeled data, and argue that such situations are unusual in practice. We then report on a large set of experiments involving labeled and unlabeled data, and demonstrate that unlabeled data can degrade classification performance when modeling assumptions are incorrect. To improve classification performance, we propose a method to switch assumed model structure based on the effect of unlabeled data.
16 Pages