Please use this identifier to cite or link to this item: http://hdl.handle.net/10397/85453
DC Field | Value | Language
dc.contributor | Department of Computing | -
dc.creator | Fung, Yiu-ming | -
dc.identifier.uri | https://theses.lib.polyu.edu.hk/handle/200/4022 | -
dc.language.iso | English | -
dc.title | Classification of heterogeneous gene expression data | -
dc.type | Thesis | -
dcterms.abstract | DNA microarray technology is a breakthrough for identifying cancer types: it examines differences in gene expression levels between normal and cancerous tissues across various cancer types. It can contribute significantly to cancer research, since tumors that are morphologically similar but molecularly different can now be classified by differences in their gene expression levels. However, reliable and robust classification performance must be guaranteed. This can be achieved by validating classification algorithms on heterogeneous gene expression data, which contain two types of variation: variation across the available microarray technologies, and variation in the expression levels of significant genes across cancer types. Classification algorithms that perform reliably and robustly on heterogeneous gene expression data are less sensitive to these variations. In this dissertation, we first develop the Impact Factor (IF) to measure the inter-experimental variation between two data sets that is caused by differences in microarray technology. The IF is then integrated into common classifiers, such as the k-nearest neighbor classifier, to classify heterogeneous gene expression data. We also develop the Majority-voting with Impact Factors (MIF) algorithm, which combines the IF, majority-voting classification, and uniform histogram partitioning to classify multi-type, heterogeneous cancer gene expression data. To demonstrate the reliability and robustness of the IF measure and the MIF algorithm, we run experiments on 10 data sets, which were published in 7 publications and produced with different microarray technologies under various experimental settings and conditions. The experimental results show good classification performance in terms of accuracy, sensitivity, and specificity. For the MIF algorithm, we also compare our results with other researchers' work, and the comparisons show improved performance. In addition, bagging, a meta-classification algorithm based on voting, is compared for further performance evaluation. Surprisingly, applying bagging does not improve performance significantly, and the MIF algorithm performs better. | -
dcterms.accessRights | open access | -
dcterms.educationLevel | M.Phil. | -
dcterms.extent | ix, 123 leaves : ill. ; 30 cm | -
dcterms.issued | 2005 | -
dcterms.LCSH | Hong Kong Polytechnic University -- Dissertations | -
dcterms.LCSH | Gene expression | -
dcterms.LCSH | DNA microarrays | -
dcterms.LCSH | Genomics | -
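
The abstract reports classification performance as accuracy, sensitivity, and specificity. As a reading aid, here is a minimal sketch of how these three measures are conventionally computed for one class treated as "positive". The function name `binary_metrics` and the sample labels are hypothetical, chosen for illustration only; they are not taken from the thesis, and the thesis may compute its measures differently (for example, per cancer type in the multi-type setting).

```python
def binary_metrics(y_true, y_pred, positive):
    """Return (accuracy, sensitivity, specificity), treating `positive` as the positive class."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == positive:
            if pred == positive:
                tp += 1  # positive sample correctly identified
            else:
                fn += 1  # positive sample missed
        else:
            if pred == positive:
                fp += 1  # negative sample wrongly flagged as positive
            else:
                tn += 1  # negative sample correctly rejected
    total = tp + tn + fp + fn
    nan = float("nan")
    accuracy = (tp + tn) / total if total else nan
    sensitivity = tp / (tp + fn) if (tp + fn) else nan  # true positive rate
    specificity = tn / (tn + fp) if (tn + fp) else nan  # true negative rate
    return accuracy, sensitivity, specificity


# Hypothetical labels, made up for illustration (not data from the thesis).
y_true = ["tumor", "tumor", "tumor", "normal", "normal", "normal", "normal", "tumor"]
y_pred = ["tumor", "tumor", "normal", "normal", "normal", "tumor", "normal", "tumor"]

acc, sens, spec = binary_metrics(y_true, y_pred, positive="tumor")
print(acc, sens, spec)
```

Sensitivity and specificity are reported alongside accuracy because accuracy alone can look good on imbalanced data even when one class is rarely recognised.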
Appears in Collections: Thesis