Title: Pattern discovery for large mixed-mode database
Authors: Wong, AKC
Wu, B
Chan, KCC 
Keywords: Attribute clustering
Data mining
Mixed mode data
Mutual information
Pattern discovery
Unsupervised discretization
Issue Date: 2010
Source: CIKM '10 Proceedings of the 19th ACM International Conference on Information and Knowledge Management, Toronto, Ontario, Canada, July 11-15, 2010, p. 859-868
Abstract: In business and industry today, large databases with mixed data types (continuous and categorical) are very common. There is a great need to discover patterns in them for knowledge interpretation and understanding. In the past, for classification, this problem has been solved as a discrete data problem by first discretizing the continuous data based on the class-attribute interdependence relationship. However, no proper solution has existed so far when class information is unavailable. Hence, important pattern post-processing tasks such as pattern clustering and summarization cannot be applied to mixed-mode data. This paper presents a new method for solving the problem. It is based on two essential concepts. (1) Even when class information is absent, in a correlated dataset the attribute with the strongest interdependence with the others in the group can be used to drive the discretization of the continuous data. (2) For a large database, correlated attribute groups must first be obtained by attribute clustering before (1) can be applied. Based on (1) and (2), pattern discovery methods are developed for mixed-mode data. Extensive experiments using synthetic and real-world data were conducted to validate the usefulness and effectiveness of the proposed method.
ISBN: 978-1-4503-0099-5
DOI: 10.1145/1871437.1871547
Appears in Collections: Conference Paper

Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.