Please use this identifier to cite or link to this item: http://hdl.handle.net/10397/39817
Title: Combining classification with clustering for web person disambiguation
Authors: Xu, J
Lu, Q 
Liu, Z
Keywords: Key phrase
String kernel
SVM
Web person disambiguation
Issue Date: 2012
Source: WWW '12 Companion Proceedings of the 21st International Conference on World Wide Web, Lyon, France, April 13-17, 2012, ACM, Lyon, France, p. 637-638
Abstract: Web Person Disambiguation (WPD) is often conducted by clustering web documents to identify the different namesakes that share a given name. This paper presents a new key phrase-based clustering method combined with a second-step re-classification that identifies outliers to improve clustering performance. For document clustering, a hierarchical agglomerative approach is applied to a vector space model that uses key phrases as the main feature. Outliers in the clustering results are then identified through a centroid-based method and reclassified by an SVM classifier into more appropriate clusters, using a key phrase-based string kernel model as its feature space. The re-classification uses the first-step clustering result as its training data, which avoids the separate training data required by most classification algorithms. Experiments conducted on the WePS-2 dataset show that the key phrase-based algorithm is effective in improving WPD performance.
URI: http://hdl.handle.net/10397/39817
ISBN: 978-1-4503-1230-1
DOI: 10.1145/2187980.2188165
Appears in Collections:Conference Paper
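
The two-step pipeline described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes binary key phrase vectors, cosine similarity, average-link merging, and hypothetical thresholds (`stop_sim`, `min_sim`). The SVM with a string kernel is replaced here by a much simpler nearest-centroid reassignment, used only as a stand-in. The toy documents are invented for illustration.

```python
from collections import Counter
from math import sqrt

def cosine(a, b):
    """Cosine similarity between two sparse key phrase vectors."""
    dot = sum(w * b[k] for k, w in a.items() if k in b)
    na = sqrt(sum(w * w for w in a.values()))
    nb = sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def centroid(vectors):
    """Mean vector of a non-empty list of sparse vectors."""
    total = Counter()
    for v in vectors:
        total.update(v)
    return Counter({k: w / len(vectors) for k, w in total.items()})

def hac(vectors, stop_sim):
    """Average-link agglomerative clustering over document indices.
    Merging stops when the best pair falls below stop_sim (assumed threshold)."""
    clusters = [[i] for i in range(len(vectors))]

    def avg_sim(c1, c2):
        pairs = [(i, j) for i in c1 for j in c2]
        return sum(cosine(vectors[i], vectors[j]) for i, j in pairs) / len(pairs)

    while len(clusters) > 1:
        best, (x, y) = max(
            (avg_sim(clusters[x], clusters[y]), (x, y))
            for x in range(len(clusters))
            for y in range(x + 1, len(clusters)))
        if best < stop_sim:
            break
        clusters[x] += clusters[y]
        del clusters[y]
    return clusters

def find_outliers(vectors, clusters, min_sim):
    """Flag documents that sit far from their own cluster centroid."""
    flagged = []
    for c in clusters:
        if len(c) < 2:
            continue  # a singleton always matches its own centroid
        cen = centroid([vectors[i] for i in c])
        flagged += [i for i in c if cosine(vectors[i], cen) < min_sim]
    return flagged

def reclassify(vectors, clusters, outliers):
    """Stand-in for the SVM step: the non-outlier members act as the
    'training data', and each outlier moves to the most similar centroid."""
    flagged = set(outliers)
    core = [[i for i in c if i not in flagged] for c in clusters]
    core = [c for c in core if c]
    if not core:
        return clusters
    cens = [centroid([vectors[i] for i in c]) for c in core]
    for i in outliers:
        best = max(range(len(cens)), key=lambda k: cosine(vectors[i], cens[k]))
        core[best].append(i)
    return core

# Toy key phrase lists for pages that mention the same (hypothetical) name.
docs = [
    ["machine learning", "hong kong", "polyu"],
    ["machine learning", "polyu", "professor"],
    ["machine learning", "hong kong", "professor"],
    ["jazz", "saxophone", "album"],
    ["jazz", "album", "tour"],
    ["jazz", "saxophone", "tour"],
    ["jazz", "festival", "review", "ticket"],
]
vectors = [Counter(d) for d in docs]
clusters = hac(vectors, stop_sim=0.2)
outliers = find_outliers(vectors, clusters, min_sim=0.7)
final = reclassify(vectors, clusters, outliers)
```

In this toy run the last document is only loosely tied to the other jazz pages, so it is the kind of item the outlier step is meant to catch before re-classification decides where it belongs. A closer approximation of the paper would swap `reclassify` for an SVM trained on the non-outlier cluster members with a key phrase-based string kernel.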

