Title: Using complex linguistic features in context-sensitive text classification techniques
Authors: Wong, AKS; Lee, JWT; Yeung, DS
Issue Date: 2005
Source: Proceedings of 2005 International Conference on Machine Learning and Cybernetics, 2005, 18-21 August 2005, Guangzhou, China, v. 5, p. 3183-3188
Abstract: Text classification (TC) is the task of automatically classifying documents based on learned document features. Many popular TC models use the simple occurrence of words in a document as features, and their design commonly assumes that word occurrences are statistically independent. Although this assumption obviously does not hold in general, these TC models have been robust and efficient in their task. Some recent studies have shown that context-sensitive TC approaches, which take context into consideration in the form of word co-occurrences, generally perform better. On the other hand, there have been many studies on the use of complex linguistic or semantic features, instead of simple word occurrences, as features for information retrieval and classification tasks. While these complex features may intuitively be more relevant to the tasks concerned, the results of these studies on their effectiveness have been mixed and inconclusive. In this paper we present our investigation into the use of some complex linguistic features with a context-sensitive TC method. Our experimental results show some potential advantages of such an approach.
Keywords: Classification; Computational linguistics; Context-sensitive languages; Text analysis
Publisher: IEEE
ISBN: 0-7803-9091-1
DOI: 10.1109/ICMLC.2005.1527491
Appears in Collections: Conference Paper

Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.