Title: Uncertainty-based spatial association rule mining
Authors: Zhang, Anshu
Advisors: Shi, Wenzhong (LSGI)
Keywords: Geographic information systems; Geospatial data; Data mining
Issue Date: 2017
Publisher: The Hong Kong Polytechnic University
Abstract: Spatial association rule mining (SARM) is the discovery of implicit 'antecedent --> consequence' rules from spatial databases. SARM is an emerging topic in geographical information science (GISc) and a powerful tool in research and practice. The usefulness of SARM results hinges on their reliability: the abundance of authentic rules, control over the risk of spurious rules, and the goodness of rule interestingness measure (RIM) values. Such reliability, however, faces great challenges from uncertainties of various types and sources. Uncertainty-based SARM, proposed in this thesis, aims to enhance the reliability of SARM results in all three of these aspects via novel and improved uncertainty handling methods. In response to three critical uncertainty issues in SARM (data error, gradual or vague spatial concepts, and uncertain concept modelling), this thesis realises the following three interrelated objectives.

Mining significant spatial association rules from uncertain data: a new statistical test on the rules is developed to correct the existing statistically sound test, which is indispensable for strict control over spurious rules, for distortions due to data error. The new test combines original data error propagation modelling with simulative processes. On average, the new method can compensate for 50% of the loss of true rules due to data error, thus markedly enriching the authentic results. This efficacy is also largely robust to inaccurate data error information and to dependent error probabilities in imperfect practical data.

Gaussian-curve-based fuzzy data discretization and crisp-fuzzy SARM: a Gaussian-curve-based model is presented to strengthen spatial semantics in fuzzy data discretization. In addition, crisp-fuzzy SARM is introduced to synthesise statistically sound testing based on ordinary (crisp) SARM with RIM evaluation based on fuzzy rules. These techniques can discover at least twice as many authentic rules as conventional fuzzy SARM; avoid the large overestimations of RIM values, usually by more than 50%, found in ordinary SARM; and keep the risk of spurious rules minimal.

Genetic algorithm (GA) for crisp-fuzzy SARM: the new GA integrates the merits of statistical evaluation, the new Gaussian-curve-based data discretization and crisp-fuzzy SARM. Experimentwise and generationwise adjusted statistical tests are devised for the GA to satisfy different user needs. The proposed GA can produce several times as many rules as non-GA SARM, with RIM values as high. The risk of spurious rules stays below low user-specified levels for both testing approaches.

The developments for the three objectives are shown to be effective and robust through experiments on synthetic and real-world data under various experimental settings and data conditions. Case studies applying these developments to urbanization-socioeconomic changes, wildfire risks, and hotel room price determinants contribute new findings to the corresponding research topics. In sum, the methods developed in this thesis can alleviate manifold uncertainty issues in SARM, thereby significantly improving the reliability of SARM results in all three of the aforesaid aspects. As a systematic study of uncertainty handling in SARM, this thesis enriches GISc theories and methodologies. In particular, it answers the increasingly pressing need for quality and reliability studies in GISc. The work is also practically useful for improving decision making and user services in various domains involving spatial data, as exemplified by the case studies.
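
The difference between crisp and Gaussian-curve-based fuzzy discretization that the abstract relies on can be illustrated with a minimal sketch. This is not the thesis's implementation: the function names, the sample distances, the 500 m cut-off, the curve parameters, and the use of the minimum as the fuzzy "and" are all assumptions made for illustration only.

```python
import math


def gaussian_membership(x, centre, sigma):
    """Degree in (0, 1] to which x belongs to a fuzzy class centred at `centre`.

    Hypothetical helper: a plain Gaussian curve, assumed here for illustration.
    """
    return math.exp(-((x - centre) ** 2) / (2.0 * sigma ** 2))


def crisp_membership(x, lower, upper):
    """Ordinary (crisp) discretization: 1 inside [lower, upper), else 0."""
    return 1.0 if lower <= x < upper else 0.0


def rule_measures(antecedent, consequent):
    """Support and confidence of 'antecedent --> consequent'.

    Inputs are per-record membership degrees in [0, 1]. The minimum is used
    as the fuzzy conjunction (an assumed, commonly used choice); with 0/1
    inputs this reduces to ordinary support and confidence.
    """
    n = len(antecedent)
    joint = sum(min(a, c) for a, c in zip(antecedent, consequent))
    total_antecedent = sum(antecedent)
    support = joint / n
    confidence = joint / total_antecedent if total_antecedent > 0 else 0.0
    return support, confidence


# Made-up records: distance to a road in metres, and whether an event occurred.
distances = [50.0, 120.0, 300.0, 480.0, 900.0]
event = [1.0, 1.0, 1.0, 0.0, 0.0]

# Crisp class "near road": a hard cut-off at 500 m (assumed threshold).
near_crisp = [crisp_membership(d, 0.0, 500.0) for d in distances]
# Fuzzy class "near road": membership decays smoothly with distance.
near_fuzzy = [gaussian_membership(d, 0.0, 250.0) for d in distances]

crisp_support, crisp_confidence = rule_measures(near_crisp, event)
fuzzy_support, fuzzy_confidence = rule_measures(near_fuzzy, event)
print("crisp:", crisp_support, crisp_confidence)
print("fuzzy:", fuzzy_support, fuzzy_confidence)
```

In this sketch a record at 480 m counts fully as "near" under the crisp cut-off but only weakly under the Gaussian curve, so the two discretizations yield different measure values for the same rule. Which measure is more trustworthy, and how statistical testing is combined with it, is the subject of the thesis and is not reproduced here.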
Description: x, 155 pages : illustrations
PolyU Library Call No.: [THS] LG51 .H577P LSGI 2017 2017 Zhang
Rights: All rights reserved.
Appears in Collections: Thesis

Files in This Item:
File                           Description                      Size      Format
991021952840003411_link.htm    For PolyU Users                  167 B     HTML
991021952840003411_pira.pdf    For All Users (Non-printable)    2.11 MB   Adobe PDF