Please use this identifier to cite or link to this item:
http://hdl.handle.net/10397/117636
| Title: | Scalable flight cancellation prediction with ensemble distributed KNN and feature selection |
| Authors: | Kan, HY; Chau, K; Pang, P |
| Issue Date: | 2025 |
| Source: | Scientific reports, 2025, v. 15, 34936 |
| Abstract: | Flight cancellation prediction accuracy remains essential for airlines because it allows for automatic risk reduction of financial losses and passenger satisfaction decline. Heavy aviation big data presents challenges to traditional prediction methods which makes their practical use difficult. The proposed research brings forth an innovative approach utilizing distributed ensemble learning for conducting flight cancellation predictions at scale. The Artificial Bee Colony (ABC) algorithm operates within our method to determine the most essential predictors from an extensive dataset through optimal feature selection. The MapReduce framework enables distributed K-Nearest Neighbor (DKNN) model implementation to process features selected by the subsequent stage. The distribution of KNN models within this architecture allows the processing of extensive datasets effectively and delivers better accuracy through a collective model voting system. Our system performs computations on flight data collected from three New York City airports (JFK, LGA, and EWR) with a minimum computational advantage exceeding 25% above non-distributed KNN models. The ensemble strategy enhances prediction accuracy by 3.42% to obtain an average accuracy level of 95.79% which represents a 2.2% improvement above previous methods. Our distributed ensemble methodology proves its effectiveness for predicting flight cancellations accurately in big data environments through the presented experimental results. |
| Keywords: | Big data; Distributed k nearest neighbors (DKNN); Ensemble learning; Flight cancellation prediction; MapReduce |
| Publisher: | Nature Publishing Group |
| Journal: | Scientific reports |
| EISSN: | 2045-2322 |
| DOI: | 10.1038/s41598-025-18716-1 |
| Rights: | Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/. © The Author(s) 2025 The following publication Kan, H.Y., Chau, K. & Cheong-iao Pang, P. Scalable flight cancellation prediction with ensemble distributed KNN and feature selection. Sci Rep 15, 34936 (2025) is available at https://doi.org/10.1038/s41598-025-18716-1. |
| Appears in Collections: | Journal/Magazine Article |
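
The approach summarized in the abstract (local KNN models trained on separate data partitions, with their predictions combined by voting) can be illustrated with a minimal single-process sketch. This is not the authors' implementation. The function names, the toy data, the stride-based partitioning, and the values of `k` and `n_partitions` are all assumptions made for illustration. The MapReduce framework is only imitated with plain Python functions, and the ABC feature-selection step is assumed to have already reduced each record to the two features used here.

```python
import math
from collections import Counter


def knn_predict(train, query, k):
    """Classify `query` by majority vote among its k nearest rows in `train`.

    Each row is a (features, label) pair; distance is Euclidean.
    """
    nearest = sorted(train, key=lambda row: math.dist(row[0], query))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


def map_phase(partitions, query, k):
    """Map step (imitated): each partition's local KNN model emits one vote."""
    return [knn_predict(part, query, k) for part in partitions if part]


def reduce_phase(votes):
    """Reduce step (imitated): combine the local votes by simple majority."""
    return Counter(votes).most_common(1)[0][0]


def ensemble_dknn(data, query, k=3, n_partitions=3):
    """Split `data` into partitions by stride, then vote across partitions."""
    partitions = [data[i::n_partitions] for i in range(n_partitions)]
    return reduce_phase(map_phase(partitions, query, k))


# Hypothetical toy records: two already-selected features and a label
# (0 = not cancelled, 1 = cancelled). These values are made up.
toy_data = [
    ((0.0, 0.0), 0), ((10.0, 10.0), 1), ((1.0, 0.0), 0), ((9.0, 10.0), 1),
    ((0.0, 1.0), 0), ((10.0, 9.0), 1), ((1.0, 1.0), 0), ((9.0, 9.0), 1),
    ((0.5, 0.5), 0), ((9.5, 9.5), 1), ((1.5, 0.5), 0), ((8.5, 9.5), 1),
]

print(ensemble_dknn(toy_data, (0.2, 0.2)))
print(ensemble_dknn(toy_data, (9.8, 9.8)))
```

In a real distributed setting the map step would run on separate workers, so each local model only needs to hold and search its own partition. The sketch keeps everything in one process purely to show how the votes flow.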
Files in This Item:
| File | Description | Size | Format | |
|---|---|---|---|---|
| s41598-025-18716-1.pdf | | 5.76 MB | Adobe PDF | View/Open |
Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.



