Title: Hydrological predictions using data-driven models coupled with data preprocessing techniques
Authors: Wu, Conglin
Keywords: Hong Kong Polytechnic University -- Dissertations
Hydrological forecasting
Hydrology -- Data processing
Hydrologic models.
Issue Date: 2010
Publisher: The Hong Kong Polytechnic University
Abstract: Data-driven models, particularly soft computing models, have become an appropriate alternative to knowledge-driven models in many hydrological prediction scenarios, including rainfall, streamflow, and rainfall-runoff. The primary reason is that data-driven models rely solely on previous hydro-meteorological data without directly taking into account the underlying physical processes. However, data-driven models inevitably introduce uncertainty into the forecasts as a result of oversimplified assumptions, inappropriate training data, model inputs, model configuration, and even the individual experience of modelers. This thesis aims to improve the accuracy of hydrological forecasting in three respects: model inputs, selection of models, and data-preprocessing techniques. Seven input selection techniques, namely linear correlation analysis (LCA), false nearest neighbors, correlation integral, stepwise linear regression, average mutual information, partial mutual information, and an artificial neural network (ANN) based on a multi-objective genetic algorithm, are first examined to select optimal model inputs in each prediction scenario. Representative models, such as the K-nearest-neighbors (K-NN) model, the dynamic system based model (DSBM), ANN, modular ANN (MANN), and the hybrid artificial neural network-support vector regression (ANN-SVR) model, are then proposed to conduct rainfall and streamflow forecasts. Four data-preprocessing methods, namely moving average (MA), principal component analysis (PCA), singular spectrum analysis (SSA), and wavelet analysis (WA), are further investigated by integrating them with the abovementioned forecasting models. K-NN, ANN, and MANN are used to predict monthly and daily rainfall series, with linear regression (LR) as the benchmark. The comparison of the seven input techniques indicates that LCA is able to identify model inputs reasonably.
In the normal mode (viz., without data preprocessing), MANN performs best, but its advantage over ANN is not significant in monthly rainfall series forecasting. Compared with results in the normal mode, SSA improves model performance considerably, whereas MA and PCA have negligible influence. When coupled with SSA, the advantages of MANN over the other models are quite noticeable, particularly for daily rainfall forecasting.
ANN, MANN, ANN-SVR, and DSBM are employed to estimate monthly and daily streamflow series, where model inputs depend only on previous flow observations. The best model inputs are again identified by LCA. In the normal mode, the global DSBM model performs similarly to ANN. MANN and ANN-SVR tend to be interchangeable and, compared to ANN, noticeably improve the accuracy of flow predictions, particularly for a non-smooth flow series. However, a prediction lag effect can be observed in daily streamflow series forecasting. In the data preprocessing mode, both SSA and WA significantly improve model performance, but SSA is markedly superior to WA. ANN, MANN, and LR are also used to perform daily rainfall-runoff (R-R) prediction, where model inputs consist of previous rainfall and streamflow observations. The best model inputs are again obtained by LCA. Irrespective of mode, the advantage of MANN over ANN is not obvious. Compared to models that depend solely on previous flow data as inputs, these R-R models make more accurate predictions. However, in the normal mode the improvement tends to diminish as the forecasting horizon increases. The situation is reversed in the SSA mode, where the advantage of the ANN R-R model becomes more significant as the prediction horizon increases. The findings above concern point prediction, which uses the ANN-SSA R-R model. On the basis of this model, the uncertainty estimation based on local errors and clustering (UNEEC) method is added to obtain interval predictions of daily rainfall-runoff. The UNEEC method is then compared to the bootstrap method. Results indicate that UNEEC performs better in locations of low flows, whereas the bootstrap method proves well suited to locations of high flows. One of the major contributions of this research is the exploration of a viable modeling technique that couples data-driven models with SSA.
The technique has been tested on hydrological forecasts of rainfall, streamflow, and rainfall-runoff, and the predicted results are in good agreement with observations.
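As an illustration of the SSA preprocessing step named in the abstract, the following is a minimal sketch of basic singular spectrum analysis in Python with NumPy. It is not the thesis's implementation: the function names `ssa_decompose` and `ssa_filter`, the window length of 20, the choice of two retained components, and the synthetic noisy sine series are all assumptions made for the example. The sketch embeds the series in a trajectory matrix, takes its singular value decomposition, and maps each rank-one term back to a series by diagonal averaging. Keeping only the leading components gives a smoothed series that could then be fed to a forecasting model.

```python
import numpy as np


def ssa_decompose(series, window):
    """Split a 1-D series into additive components using basic SSA."""
    x = np.asarray(series, dtype=float)
    n = x.size
    k = n - window + 1
    # Trajectory (Hankel) matrix: column j holds x[j : j + window].
    traj = np.column_stack([x[j:j + window] for j in range(k)])
    u, s, vt = np.linalg.svd(traj, full_matrices=False)
    components = np.empty((s.size, n))
    for i in range(s.size):
        # Rank-one term of the SVD, shape (window, k).
        elem = s[i] * np.outer(u[:, i], vt[i])
        # Diagonal averaging: entry (r, c) belongs to time index r + c,
        # so average each anti-diagonal (a diagonal of the row-flipped matrix).
        flipped = elem[::-1]
        components[i] = [flipped.diagonal(d).mean()
                         for d in range(-window + 1, k)]
    return components


def ssa_filter(series, window, n_keep):
    """Reconstruct the series from its n_keep leading SSA components."""
    return ssa_decompose(series, window)[:n_keep].sum(axis=0)


# Synthetic example (hypothetical data, not from the thesis).
rng = np.random.default_rng(0)
t = np.arange(200)
clean = np.sin(2 * np.pi * t / 25)
noisy = clean + rng.normal(scale=0.5, size=t.size)

components = ssa_decompose(noisy, window=20)
filtered = ssa_filter(noisy, window=20, n_keep=2)
```

Summing all components recovers the input series, so the decomposition itself loses nothing. The window length and the number of retained components are tuning choices; in a forecasting setting they would normally be selected on training data only.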
Description: xi, 246 p. : ill. (some col.) ; 30 cm.
PolyU Library Call No.: [THS] LG51 .H577P CSE 2010 Wu
Rights: All rights reserved.
Appears in Collections: Thesis

Files in This Item:
File                 Description                     Size      Format
b23930640_link.htm   For PolyU Users                 162 B     HTML
b23930640_ir.pdf     For All Users (Non-printable)   2.23 MB   Adobe PDF
Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.