New Frontiers For An Artificial Immune System
Greensmith, Julie
HPL-2003-204
Keyword(s): artificial immune system; document classification; feature vectors; AIRS
Abstract: AIRS, a resource-limited artificial immune classifier system, has performed well on various classification tasks, including data clustering. This thesis proposes the use of this system for the complex task of multi-class document classification. The AIRS system is first validated using a standard machine learning dataset that has not previously been used with this classifier. The use of AIRS for document classification is then examined. This includes the pre-processing of HTML documents and the extraction, selection and representation of features in order to compile feature vectors. AIRS was used to classify various Internet documents, using a variety of datasets. Comparisons were made in which the number of documents, the number of classes and the number of features were varied independently. Additionally, AIRS was compared with another text classification package as a benchmarking exercise. On completion of this work, we are confident that AIRS is a suitable candidate for increasingly complex tasks such as hierarchical document classification and multiple taxonomic mappings.
71 Pages