Please use this identifier to cite or link to this item: http://hdl.handle.net/10397/106952
DC Field | Value | Language
dc.contributor | Department of Electrical and Electronic Engineering | -
dc.creator | Wong, GY | -
dc.creator | Leung, FHF | -
dc.creator | Ling, SH | -
dc.date.accessioned | 2024-06-07T00:59:06Z | -
dc.date.available | 2024-06-07T00:59:06Z | -
dc.identifier.issn | 0020-0255 | -
dc.identifier.uri | http://hdl.handle.net/10397/106952 | -
dc.language.iso | en | en_US
dc.publisher | Elsevier Inc. | en_US
dc.rights | © 2018 Published by Elsevier Inc. | en_US
dc.rights | ©2018. This manuscript version is made available under the CC-BY-NC-ND 4.0 license https://creativecommons.org/licenses/by-nc-nd/4.0/ | en_US
dc.rights | The following publication Wong, G. Y., Leung, F. H., & Ling, S. H. (2018). A hybrid evolutionary preprocessing method for imbalanced datasets. Information Sciences, 454, 161-177 is available at https://doi.org/10.1016/j.ins.2018.04.068. | en_US
dc.title | A hybrid evolutionary preprocessing method for imbalanced datasets | en_US
dc.type | Journal/Magazine Article | en_US
dc.identifier.spage | 161 | -
dc.identifier.epage | 177 | -
dc.identifier.volume | 454-455 | -
dc.identifier.doi | 10.1016/j.ins.2018.04.068 | -
dcterms.abstract | Imbalanced datasets are commonly encountered in real-world classification problems. Many machine learning algorithms are originally designed for well-balanced datasets, therefore re-sampling has become an important step to pre-process imbalanced data. This aims to balance the datasets by increasing the samples of the smaller class or decreasing the samples of the larger class, which are known as over-sampling and under-sampling, respectively. In this paper, a sampling strategy that is based on both over-sampling and under-sampling is proposed, in which the new samples of the smaller class are created based on fuzzy logic. Improvement of the datasets is done by the evolutionary computational method of Cross-generational elitist selection, Heterogeneous recombination and Cataclysmic mutation (CHC) that under-samples both the minority and majority samples. Consequently, a hybrid preprocessing method is proposed to re-sample imbalanced datasets. The evaluation is done by applying the Support Vector Machine (SVM), C4.5 decision tree and nearest neighbor rule to train a classification model from the re-sampled training sets. From the experimental results, it can be seen that our proposed method improves both the F-measure and AUC. The over-sampling rate and complexity of the classification model are also compared. Our proposed method is found to be superior to all other methods under comparison and it is more robust in different classifiers. | -
dcterms.accessRights | open access | en_US
dcterms.bibliographicCitation | Information sciences, July 2018, v. 454-455, p. 161-177 | -
dcterms.isPartOf | Information sciences | -
dcterms.issued | 2018-07 | -
dc.identifier.scopus | 2-s2.0-85046463531 | -
dc.identifier.eissn | 1872-6291 | -
dc.description.validate | 202405 bcch | -
dc.description.oa | Accepted Manuscript | en_US
dc.identifier.FolderNumber | EIE-0515 | en_US
dc.description.fundingSource | RGC | en_US
dc.description.pubStatus | Published | en_US
dc.identifier.OPUS | 6837757 | en_US
dc.description.oaCategory | Green (AAM) | en_US
Appears in Collections: Journal/Magazine Article
Files in This Item:
File | Description | Size | Format
Leung_Hybrid_Evolutionary_Preprocessing.pdf | Pre-Published version | 1.42 MB | Adobe PDF
Open Access Information
Status open access
File Version Final Accepted Manuscript

Page views: 84 (last week: 2), as of Nov 9, 2025

Downloads: 70, as of Nov 9, 2025

Scopus citations: 54, as of Dec 19, 2025

Web of Science citations: 45, as of Dec 18, 2025


Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.