Please use this identifier to cite or link to this item: http://hdl.handle.net/10397/64355
Title: Text categorization based on sub-topic clusters
Authors: Chik, FCY
Luk, RWP 
Chung, FL 
Issue Date: 2005
Publisher: Springer
Source: Lecture notes in computer science (including subseries Lecture notes in artificial intelligence and lecture notes in bioinformatics), v. 3513, p. 203-214
Journal: Lecture notes in computer science (including subseries Lecture notes in artificial intelligence and lecture notes in bioinformatics) 
Abstract: The distribution of the number of documents across topic classes is typically highly skewed. This leads to good micro-average performance but less desirable macro-average performance. By viewing topics as clusters in a high-dimensional space, we propose the use of clustering to determine subtopic clusters for large topic classes, assuming that large topic clusters are in general a mixture of a number of subtopic clusters. We used the Reuters news articles and support vector machines to evaluate whether using subtopic clusters can lead to better macro-average performance.
Description: 10th International Conference on Applications of Natural Language to Information Systems, NLDB 2005, Alicante, Spain, June 15-17, 2005
URI: http://hdl.handle.net/10397/64355
ISBN: 978-3-540-26031-8 (print) 978-3-540-32110-1 (online)
ISSN: 0302-9743 (print) 1611-3349 (online)
DOI: 10.1007/11428817_19
Appears in Collections:Conference Paper
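
Note on the abstract: the gap between micro-average and macro-average performance under skewed class sizes can be illustrated with a small sketch. It is not taken from the paper. The toy data, the label names "large" and "small", the single-label setting, and the helper names f1 and micro_macro_f1 are all assumptions made for illustration. Micro-averaging pools the counts over all documents, so a large topic dominates the score; macro-averaging gives every topic equal weight, so a poorly handled small topic pulls the score down.

```python
from collections import Counter


def f1(tp, fp, fn):
    """F1 from raw counts; taken as 0.0 when there are no true positives."""
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def micro_macro_f1(y_true, y_pred):
    """Return (micro-F1, macro-F1) for single-label predictions."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # the predicted class gains a false positive
            fn[truth] += 1  # the true class gains a false negative
    # Micro: pool the counts over all classes, then compute one F1.
    micro = f1(sum(tp.values()), sum(fp.values()), sum(fn.values()))
    # Macro: compute F1 per class, then take the unweighted mean.
    macro = sum(f1(tp[c], fp[c], fn[c]) for c in labels) / len(labels)
    return micro, macro


# Hypothetical skewed collection: 95 documents in one topic, 5 in another.
y_true = ["large"] * 95 + ["small"] * 5
# A degenerate classifier that always predicts the large topic.
y_pred = ["large"] * 100

micro, macro = micro_macro_f1(y_true, y_pred)
print(f"micro-F1 = {micro:.3f}, macro-F1 = {macro:.3f}")
# prints: micro-F1 = 0.950, macro-F1 = 0.487
```

In this toy case the classifier never finds the small topic, yet the micro-average stays at 0.95 while the macro-average falls below 0.5. This is the kind of imbalance the abstract refers to when it says a skewed distribution favors micro-average performance.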

Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.