Please use this identifier to cite or link to this item:
http://hdl.handle.net/10397/86995
Title: | Iterative subspace text categorization |
Authors: | Chik, Cho-yiu Francis |
Degree: | M.Phil. |
Issue Date: | 2013 |
Abstract: | Text categorization has many practical applications. The dominant approach uses machine learning techniques, in which classification rules are created automatically from labeled texts. The method proposed to combat the curse of dimensionality is subspace methodology. However, it has been applied broadly only in unsupervised text categorization; its performance on supervised text categorization has not yet been established. This thesis investigates an iterative subspace method of pattern classification. For the topic pairs "carcass_livestock" and "soybean_oilseed" from the Reuters-21578 collection, results with a confidence level greater than 95% under 8-fold, 10-fold, and 12-fold cross validation show the potential of this approach. Under 8-fold cross validation, precision for "livestock" improves by 8.24% compared with a 1-level classifier, a standard Support Vector Machine (SVM). There is also an 11.85% improvement for "nat-gas" compared with a Soft Margin SVM classifier under 8-fold cross validation. Performance is expected to improve further with other optimization techniques. |
Subjects: | Text processing (Computer science); Artificial intelligence; Hong Kong Polytechnic University -- Dissertations |
Pages: | xi, 152 leaves : ill. ; 30 cm. |
Appears in Collections: | Thesis |
Access
View full-text via https://theses.lib.polyu.edu.hk/handle/200/7229