Please use this identifier to cite or link to this item:
Title: A novel evolutionary data mining algorithm with applications to churn prediction
Authors: Au, WH
Chan, CC 
Yao, X
Keywords: Churn prediction
Customer retention
Data mining
Evolutionary computation
Genetic algorithms
Issue Date: 2003
Publisher: Institute of Electrical and Electronics Engineers
Source: IEEE transactions on evolutionary computation, 2003, v. 7, no. 6, p. 532-544 How to cite?
Journal: IEEE transactions on evolutionary computation 
Abstract: Classification is an important topic in data mining research. Given a set of data records, each of which belongs to one of a number of predefined classes, the classification problem is concerned with the discovery of classification rules that can allow records with unknown class membership to be correctly classified. Many algorithms have been developed to mine large data sets for classification models and they have been shown to be very effective. However, when it comes to determining the likelihood of each classification made, many of them are not designed with such purpose in mind. For this, they are not readily applicable to such problem as church prediction. For such an application, the goal is not only to predict whether or not a subscriber would switch from one carrier to another, it is also important that the likelihood of the subscriber's doing so be predicted. The reason for this is that a carrier can then choose to provide special personalized offer and services to those subscribers who are predicted with higher likelihood to churn. Given its importance, we propose a new data mining algorithm, called data mining by evolutionary learning (DMEL), to handle classification problems of which the accuracy of each predictions made has to be estimated. In performing its tasks, DMEL searches through the possible rule space using an evolutionary approach that has the following characteristics: 1) the evolutionary process begins with the generation of an initial set of first-order rules (i.e., rules with one conjunct/condition) using a probabilistic induction technique and based on these rules, of higher order (two or more conjuncts) are obtained iteratively; 2) when identifying interesting rules, an objective interestingness measure is used; 3) the fitness of a chromosome is defined in terms of the probability that the attribute values of a record can be correctly determined using the rules it encodes; and 4) the likelihood of predictions (or classifications) made are estimated so that subscribers can be ranked according to their likelihood to churn. Experiments with different data sets showed that DMEL is able to effectively discover interesting classification rules. In particular, it is able to predict churn accurately under different churn rates when applied to real telecom subscriber data.
ISSN: 1089-778X (print)
1941-0026 (online)
DOI: 10.1109/TEVC.2003.819264
Appears in Collections:Journal/Magazine Article

View full-text via PolyU eLinks SFX Query
Show full item record


Last Week
Last month
Citations as of Feb 6, 2019


Last Week
Last month
Citations as of Feb 18, 2019

Page view(s)

Last Week
Last month
Citations as of Feb 17, 2019

Google ScholarTM



Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.