Please use this identifier to cite or link to this item: http://hdl.handle.net/10397/3777
Title: Classification of heterogeneous gene expression data
Authors: Fung, Yiu-ming
Keywords: Hong Kong Polytechnic University -- Dissertations
Gene expression
DNA microarrays
Genomics
Issue Date: 2005
Publisher: The Hong Kong Polytechnic University
Abstract: DNA microarray technology is a breakthrough in the identification of cancer types: it examines differences in gene expression levels between normal and cancerous tissues across various cancer types. The technology can contribute significantly to cancer research, since morphologically similar but molecularly different tumors can now be classified by differences in their gene expression levels. However, classification performance must be reliable and robust. This can be established by validating classification algorithms on heterogeneous gene expression data, which contain two types of variation: variation among the available microarray technologies, and variation in the expression levels of significant genes across cancer types. Classification algorithms that perform reliably and robustly on heterogeneous gene expression data are less sensitive to these variations. In this dissertation, we first develop the Impact Factor (IF) to measure the inter-experimental variation between two data sets caused by differences in microarray technology. The IF is then integrated into common classifiers, such as k-nearest neighbor classifiers, for the classification of heterogeneous gene expression data. We also develop the Majority-voting with Impact Factors (MIF) algorithm, which combines the IF, the majority-voting classification algorithm, and the uniform histogram partitioning technique to classify multi-type, heterogeneous cancer gene expression data. To demonstrate the reliability and robustness of the IF measure and the MIF algorithm, we run experiments on 10 data sets, reported in 7 publications and produced with different microarray technologies under various experimental settings and conditions. The experimental results show good classification performance in terms of accuracy, sensitivity and specificity.
For the MIF algorithm, we also compare our results with other researchers' work; the comparisons show improved performance. In addition, bagging, a meta-classification algorithm based on voting, is compared for further performance evaluation. Surprisingly, applying bagging does not yield a significant performance improvement, and the MIF algorithm still performs better.
Description: ix, 123 leaves : ill. ; 30 cm.
PolyU Library Call No.: [THS] LG51 .H577M COMP 2005 Fung
URI: http://hdl.handle.net/10397/3777
Rights: All rights reserved.
Appears in Collections: Thesis

Files in This Item:
File                Description                     Size      Format
b18099592_link.htm  For PolyU Users                 162 B     HTML
b18099592_ir.pdf    For All Users (Non-printable)   1.82 MB   Adobe PDF

Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.