Please use this identifier to cite or link to this item: http://hdl.handle.net/10397/86995
Title: Iterative subspace text categorization
Authors: Chik, Cho-yiu Francis
Degree: M.Phil.
Issue Date: 2013
Abstract: Text categorization has many practical applications. The dominant approach uses machine learning techniques in which classification rules are created automatically from labeled texts. Subspace methodology has been proposed to combat the curse of dimensionality, but it has been applied broadly only in unsupervised text categorization; its performance on supervised text categorization has not yet been established. This thesis investigates an iterative subspace method of pattern classification. For the topic pairs "carcass_livestock" and "soybean_oilseed" from the Reuters-21578 collection, the results, with a confidence level greater than 95% under 8-fold, 10-fold, and 12-fold cross-validation, show the potential of this approach. Performance is expected to improve further with other optimization techniques. The results are nonetheless promising: under 8-fold cross-validation, precision on "livestock" improves by 8.24% compared with a 1-level classifier, the standard Support Vector Machine (SVM), and there is an 11.85% improvement on "nat-gas" compared with a Soft Margin SVM classifier.
Subjects: Text processing (Computer science)
Artificial intelligence.
Hong Kong Polytechnic University -- Dissertations
Pages: xi, 152 leaves : ill. ; 30 cm.
Appears in Collections: Thesis
