Please use this identifier to cite or link to this item: http://hdl.handle.net/10397/19766
Title: An iterative data mining approach for mining overlapping coexpression patterns in noisy gene expression data
Authors: Ma, PCH; Chan, KCC
Issue Date: 2009
Publisher: Institute of Electrical and Electronics Engineers
Source: IEEE transactions on nanobioscience, 2009, v. 8, no. 3, p. 252-258
Journal: IEEE transactions on nanobioscience 
Abstract: Clustering is concerned with the discovery of groupings of records in a database. Many clustering problems are defined as partitioning problems in the sense that similar records are grouped into nonoverlapping partitions. However, clustering gene expression data to discover coexpressed genes may not always be meaningful if the problem is reduced to a partitioning problem. Due to the complexity of the underlying biological processes, a protein can interact with one or more other proteins belonging to different functional classes in order to perform a particular biological role. For this reason, when responding to different external stimuli, a gene that produces a particular protein can coexpress with more than one group of other genes. The gene can therefore belong to more than one group of coexpressed genes. This poses a challenge to many clustering algorithms, as they were not originally developed to discover overlapping clusters in noisy gene expression data. In this paper, we propose an iterative data mining approach that consists of two phases. In phase 1, a clustering algorithm is used to discover an initial, nonoverlapping partitioning of the gene expression profiles in gene expression data. In phase 2, the partition memberships of genes are redetermined iteratively by a pattern discovery technique to decide whether each gene should remain in the same partition, be moved to another partition, or also be grouped with genes in other partitions. The proposed approach has been tested with both artificial and real datasets. Experimental results show that it can improve the performance of existing clustering algorithms and can effectively discover overlapping clusters in noisy gene expression data.
URI: http://hdl.handle.net/10397/19766
ISSN: 1536-1241
EISSN: 1558-2639
DOI: 10.1109/TNB.2009.2026747
Appears in Collections: Journal/Magazine Article
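
The two-phase idea in the abstract can be illustrated with a minimal sketch. It is not the authors' method: the abstract does not specify the phase 1 clustering algorithm or the phase 2 pattern discovery technique, so this sketch assumes plain k-means for phase 1 and substitutes a simple Pearson-correlation threshold against each cluster's mean profile for phase 2. All function names, the `threshold` value, and the toy data are hypothetical.

```python
import numpy as np

def farthest_point_init(X, k):
    """Deterministic seeding: start at row 0, then repeatedly add the farthest row."""
    chosen = [0]
    for _ in range(1, k):
        d = ((X[:, None, :] - X[chosen][None, :, :]) ** 2).sum(axis=2).min(axis=1)
        chosen.append(int(d.argmax()))
    return X[chosen].astype(float)

def kmeans_partition(X, k, n_iter=50):
    """Phase 1 (assumed): k-means, giving one nonoverlapping label per gene."""
    centroids = farthest_point_init(X, k)
    labels = np.full(len(X), -1)
    for _ in range(n_iter):
        dists = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        new_labels = dists.argmin(axis=1)
        if np.array_equal(new_labels, labels):
            break
        labels = new_labels
        for c in range(k):
            if np.any(labels == c):
                centroids[c] = X[labels == c].mean(axis=0)
    return labels

def correlation_to_clusters(X, membership):
    """Pearson correlation of every gene profile with every cluster's mean profile."""
    Xc = X - X.mean(axis=1, keepdims=True)
    norms = np.linalg.norm(Xc, axis=1, keepdims=True)
    Xn = Xc / np.where(norms == 0, 1.0, norms)
    corr = np.zeros(membership.shape)
    for c in range(membership.shape[1]):
        members = membership[:, c]
        if not members.any():
            continue
        centered = X[members].mean(axis=0)
        centered = centered - centered.mean()
        norm = np.linalg.norm(centered)
        if norm > 0:
            corr[:, c] = Xn @ (centered / norm)
    return corr

def refine_memberships(X, labels, k, threshold=0.6, max_iter=20):
    """Phase 2 (stand-in, not the paper's technique): iteratively redetermine
    memberships; a gene joins every cluster it correlates with at or above
    `threshold`, and always keeps its single best-matching cluster."""
    rows = np.arange(len(X))
    membership = np.zeros((len(X), k), dtype=bool)
    membership[rows, labels] = True
    for _ in range(max_iter):
        corr = correlation_to_clusters(X, membership)
        new_membership = corr >= threshold
        new_membership[rows, corr.argmax(axis=1)] = True
        if np.array_equal(new_membership, membership):
            break
        membership = new_membership
    return membership

# Toy data (hypothetical): two uncorrelated expression patterns over 8 conditions,
# four genes following each, plus one gene that follows both.
p1 = np.array([1, -1, 1, -1, 1, -1, 1, -1], dtype=float)
p2 = np.array([1, 1, -1, -1, 1, 1, -1, -1], dtype=float)
rng = np.random.default_rng(0)
X = np.vstack([np.tile(p1, (4, 1)), np.tile(p2, (4, 1)), p1 + p2])
X = X + 0.05 * rng.standard_normal(X.shape)

labels = kmeans_partition(X, k=2)
membership = refine_memberships(X, labels, k=2)
```

In this toy setup each row of `membership` is a boolean vector over clusters, so a gene with more than one `True` entry belongs to overlapping groups. The threshold controls how readily genes are shared between clusters and would need tuning on real, noisier data.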


Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.