Please use this identifier to cite or link to this item: http://hdl.handle.net/10397/117695
Title: Dataset preprocessing and optimization for machine learning-based building load prediction: a review
Authors: Gao, DC
Zhang, X
Zhang, Y
Gao, Y
Zou, W 
Shan, K 
Issue Date: 1-Feb-2026
Source: Energy, 1 Feb. 2026, v. 344, 139992
Abstract: Machine learning is widely recognized as a crucial solution in the field of building load forecasting. Datasets play a pivotal role in machine learning, as their quality and characteristics directly determine the performance of prediction models. However, current research predominantly focuses on algorithms or summarizing the overall prediction process, often neglecting the processing and optimization of datasets for algorithmic models. This oversight limits the accuracy and reliability of prediction models in practical applications. To address this issue, this review centers on building load forecasting based on machine learning and organizes the existing research on enhancing model performance from the perspective of dataset processing and optimization. Firstly, it reviews the different building types and their corresponding data characteristics within existing building load forecasting. It constructs an enhanced framework by focusing on dimensions like feature optimization and sample data optimization. Secondly, this review comprehensively analyzes the complex impact of data characteristics from various building types on prediction accuracy. It delves deeply into data acquisition sources and data quality defects, presenting corresponding solutions. Furthermore, it evaluates multiple methods for feature selection and extraction, along with the applicable scenarios for different methods. Finally, this review discusses systematic strategies for sample data optimization. This study rigorously examines machine learning aspects in preprocessing datasets for building load prediction, offering systematic support and multidimensional insights for dataset-based algorithmic model optimization.
Keywords: Building load prediction
Dataset preprocessing
Feature extraction
Machine learning
Publisher: Pergamon Press
Journal: Energy 
ISSN: 0360-5442
EISSN: 1873-6785
DOI: 10.1016/j.energy.2026.139992
Appears in Collections: Journal/Magazine Article

Open Access Information
Status: Embargoed access
Embargo End Date: 2028-02-01

Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.