Indexed by:Journal Article
Date of Publication:2009-06-01
Journal:Journal of Computational Information Systems
Included Journals:EI, Scopus
Volume:5
Issue:3
Page Number:1373-1378
ISSN No.:1553-9105
Abstract:Self-training is a semi-supervised classification algorithm that repeatedly retrains a classifier on the labeled data, enlarging the labeled data set and improving classification accuracy at the same time. Because the initial labeled data set in self-training may be small, it is unavoidable that a considerable number of examples are mislabeled during training. A data editing technique based on the nearest neighbor rule is introduced; it extends the traditional self-training algorithm with new methods for identifying and removing mislabeled data, thereby reducing the number of mislabeled examples and improving classification accuracy. All data sets used in the experiments are from the UCI Machine Learning Repository. In the comparative experiments, classification performance improves to varying degrees. The experimental results show that introducing the data editing technique improves the classification performance of self-training. 1553-9105/ Copyright © 2009 Binary Information Press.
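The idea described in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact editing rule, so the code below assumes a simple one: a pseudo-labeled point is kept only if its label agrees with the majority label of its k nearest neighbors. The function names (`knn_vote`, `self_train_with_editing`) and the parameters `k`, `min_conf`, and `max_rounds` are hypothetical choices for illustration, not the authors' method.

```python
from collections import Counter
import math


def knn_vote(points, labels, query, k, skip=None):
    """Return (majority_label, vote_fraction) over the k nearest points to query.

    `skip` is an optional index to exclude (used so a point does not vote for itself).
    """
    nearest = sorted(
        (i for i in range(len(points)) if i != skip),
        key=lambda i: math.dist(points[i], query),
    )[:k]
    votes = Counter(labels[i] for i in nearest)
    label, count = votes.most_common(1)[0]
    return label, count / len(nearest)


def self_train_with_editing(X_lab, y_lab, X_unl, k=3, min_conf=1.0, max_rounds=10):
    """Self-training with an assumed nearest-neighbor editing step.

    Returns (X, y, pool): the enlarged labeled set and the points left unlabeled.
    """
    X, y = list(X_lab), list(y_lab)
    n_seed = len(X)  # the original labels are trusted and never edited out
    pool = list(X_unl)
    for _ in range(max_rounds):
        if not pool:
            break
        # Step 1: pseudo-label the unlabeled points the current set is confident about.
        X_new, y_new, remaining = list(X), list(y), []
        for p in pool:
            label, conf = knn_vote(X, y, p, k)
            if conf >= min_conf:
                X_new.append(p)
                y_new.append(label)
            else:
                remaining.append(p)
        if len(X_new) == len(X):
            break  # nothing was confident enough to add
        # Step 2 (editing): drop pseudo-labeled points whose label disagrees
        # with the majority of their k nearest neighbors in the enlarged set.
        X, y, rejected = X_new[:n_seed], y_new[:n_seed], []
        for i in range(n_seed, len(X_new)):
            neighbor_label, _ = knn_vote(X_new, y_new, X_new[i], k, skip=i)
            if neighbor_label == y_new[i]:
                X.append(X_new[i])
                y.append(y_new[i])
            else:
                rejected.append(X_new[i])
        pool = remaining + rejected  # rejected points may be relabeled later
    return X, y, pool
```

A usage example on two well-separated toy clusters (made-up data, not from the paper): calling `self_train_with_editing` with seeds `[(0, 0), (1, 0), (0, 1)]` labeled `"a"`, seeds `[(10, 10), (9, 10), (10, 9)]` labeled `"b"`, and a few unlabeled points near each cluster assigns every unlabeled point to the nearer cluster and leaves the pool empty. On noisier data the editing step is what would send doubtful points back to the pool instead of letting them contaminate later rounds.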