Title: A novel class noise estimation method and application in classification
Authors: Gui, L; Lu, Q; Xu, R; Li, M; Wei, Q
Keywords: Class noise; Learning with noise; Noise elimination
Issue Date: 2015
Publisher: Association for Computing Machinery
Source: Proceedings of the 24th ACM International Conference on Information and Knowledge Management & co-located workshops, October 19-23, 2015, Melbourne, Australia, p. 1081-1090
Abstract: Noise in the class labels of a training set can lead to poor classification results no matter which machine learning method is used. In this paper, we first present the problem of binary classification in the presence of random noise on the class labels, which we call class noise. Class noise is normally modeled by a class noise rate: a small, independent probability that each class label in the training set is inverted. In this paper, we propose a method to estimate the class noise rate at the level of individual samples in real data. Based on the estimation result, we propose two approaches to handle class noise. The first technique is based on modifying a given surrogate loss function. The second technique eliminates class noise by sampling. Furthermore, we prove that, with both approaches, the optimal hypothesis on the noisy distribution can approximate the optimal hypothesis on the clean distribution. Our methods achieve over 87% accuracy on a synthetic non-separable dataset even when 40% of the labels are inverted. Comparisons to other algorithms show that our methods outperform state-of-the-art approaches on several benchmark datasets in different domains with different noise rates.
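The first approach described in the abstract, modifying a surrogate loss to compensate for label noise, can be illustrated with the standard unbiased loss-correction construction for class-conditional noise (in the style of Natarajan et al.). This is a hedged sketch of that general idea, not the paper's exact per-sample formulation; the function names and the assumption of two known, global flip rates `rho_pos` and `rho_neg` are illustrative.

```python
import numpy as np

def corrected_loss(loss, t, y, rho_pos, rho_neg):
    """Noise-corrected surrogate loss for binary labels y in {+1, -1}.

    Standard unbiased construction under class-conditional noise
    (illustrative sketch, not the paper's per-sample method):
        l~(t, y) = [(1 - rho_{-y}) * l(t, y) - rho_y * l(t, -y)]
                   / (1 - rho_pos - rho_neg)
    where rho_pos / rho_neg are the probabilities that a positive /
    negative label is flipped. Minimizing l~ on noisy labels is, in
    expectation, equivalent to minimizing l on clean labels.
    """
    rho_y = rho_pos if y == 1 else rho_neg        # flip rate of y's own class
    rho_not_y = rho_neg if y == 1 else rho_pos    # flip rate of the other class
    return ((1 - rho_not_y) * loss(t, y) - rho_y * loss(t, -y)) / (
        1 - rho_pos - rho_neg
    )

# Example surrogate: logistic loss.
logistic = lambda t, y: np.log1p(np.exp(-y * t))
```

The unbiasedness property is easy to check numerically: averaging the corrected loss over the noisy-label distribution recovers the clean-label loss exactly.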
ISBN: 9781450337946
DOI: 10.1145/2806416.2806554
Appears in Collections:Conference Paper



Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.