Title: A rough set-based CBR approach for feature and document reduction in text categorization
Authors: Li, Y
Shiu, SCK 
Pal, SK
Liu, JNK
Keywords: Case-based reasoning
Natural languages
Rough set theory
Word processing
Issue Date: 2004
Publisher: IEEE
Source: Proceedings of 2004 International Conference on Machine Learning and Cybernetics, 2004, 26-29 August 2004, v. 4, p. 2438-2443
Abstract: A rough set-based case-based reasoning (CBR) approach is proposed to tackle the task of text categorization (TC). Initial work on integrating feature and document reduction/selection in TC using rough sets and CBR properties is presented. Rough set theory is incorporated to reduce the number of feature terms by generating reducts, while two CBR concepts, case coverage and case reachability, are used to select representative documents. The main contribution of this paper is that both the number of features and the number of documents are reduced with minimal loss of useful information. Experiments are conducted on text datasets from Reuters-21578. The experimental results show that, although the numbers of feature terms and documents are greatly reduced, problem-solving quality in terms of classification accuracy is preserved.
ISBN: 0-7803-8403-2
DOI: 10.1109/ICMLC.2004.1382212
Appears in Collections: Conference Paper
