Please use this identifier to cite or link to this item: http://hdl.handle.net/10397/20395
Title: Web classification of conceptual entities using co-training
Authors: Sun, A
Liu, Y
Lim, EP
Keywords: Co-training
Conceptual web classification
Web classification
Issue Date: 2011
Publisher: Pergamon Press
Source: Expert systems with applications, 2011, v. 38, no. 12, p. 14367-14375
Journal: Expert systems with applications 
Abstract: Social networking websites, which profile objects with predefined attributes and their relationships, often rely heavily on their users to contribute the required information. We, however, have observed that many web pages are actually created collectively according to the composition of some physical or abstract entity, e.g., a company, a person, or an event. Furthermore, users often like to organize pages into conceptual categories for better search and retrieval, making it feasible to extract relevant attributes and relationships from the web. Given a set of entities, each consisting of a set of web pages, we name the task of assigning pages to the corresponding conceptual categories conceptual web classification. To address this, we propose an entity-based co-training (EcT) algorithm which learns from unlabeled examples to boost its performance. Unlike existing co-training algorithms, EcT takes into account the entity semantics hidden in web pages and requires no prior knowledge about the underlying class distribution, which is crucial in standard co-training algorithms used in web classification. In our experiments, we evaluated EcT, standard co-training, and three other non-co-training learning methods on the Conf-425 dataset. Both EcT and co-training performed well when compared to the baseline methods, which required a large number of training examples.
URI: http://hdl.handle.net/10397/20395
ISSN: 0957-4174
DOI: 10.1016/j.eswa.2011.03.010
Appears in Collections:Journal/Magazine Article
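
Note on the abstract: for readers unfamiliar with the standard co-training baseline it mentions, the following is a minimal sketch under stated assumptions, not the paper's EcT algorithm. Every name here (NearestCentroid, co_train, the toy data) is hypothetical, and the nearest-centroid classifier is only a stand-in for whatever learner is used on each view. The parameters p and n are the per-round numbers of positive and negative examples each view labels; in standard co-training they are usually chosen to reflect the class ratio, which is the prior knowledge the abstract says EcT does not need.

```python
from math import dist  # Python 3.8+


def centroid(rows):
    """Component-wise mean of a list of equal-length vectors."""
    return [sum(col) / len(rows) for col in zip(*rows)]


class NearestCentroid:
    """Tiny stand-in classifier for one feature view (binary labels 0/1)."""

    def fit(self, X, y):
        # Assumes both classes appear at least once in y.
        self.c = {label: centroid([x for x, t in zip(X, y) if t == label])
                  for label in (0, 1)}
        return self

    def score(self, x):
        # > 0: closer to the class-1 centroid; the magnitude acts as confidence.
        return dist(x, self.c[0]) - dist(x, self.c[1])


def co_train(view_a, view_b, labels, rounds=5, p=1, n=1):
    """Standard two-view co-training loop (illustrative sketch).

    view_a, view_b: one feature vector per example, in the same order.
    labels: dict mapping example index -> 0/1 for the labelled examples.
    p, n: positives/negatives each view may add per round (assumed known).
    """
    labels = dict(labels)
    for _ in range(rounds):
        pool = [i for i in range(len(view_a)) if i not in labels]
        if not pool:
            break
        idx = list(labels)
        y = [labels[i] for i in idx]
        new = {}
        for view in (view_a, view_b):
            clf = NearestCentroid().fit([view[i] for i in idx], y)
            scores = {i: clf.score(view[i]) for i in pool}
            ranked = sorted(pool, key=scores.get)  # most negative first
            for i in ranked[:n]:
                if scores[i] < 0:
                    new.setdefault(i, 0)  # on a conflict, the first view's label is kept
            for i in ranked[::-1][:p]:
                if scores[i] > 0:
                    new.setdefault(i, 1)
        if not new:
            break
        labels.update(new)  # both views train on the enlarged set next round
    return labels


# Toy data: six examples, two one-dimensional views, two labelled seeds.
view_a = [[0.0], [0.1], [0.2], [1.0], [0.9], [0.8]]
view_b = [[5.0], [5.2], [5.1], [9.0], [8.8], [9.1]]
result = co_train(view_a, view_b, {0: 0, 3: 1})
print(sorted(result.items()))
```

On this toy data each view confidently labels different unlabelled examples, so the two seed labels are enough to label all six examples.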

Scopus citations: 8 (last week: 0; last month: 0), as of Jun 26, 2017
Web of Science citations: 5 (last week: 0; last month: 0), as of Jun 23, 2017
Page views: 23 (last week: 0), checked on Jun 25, 2017

Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.