Please use this identifier to cite or link to this item: http://hdl.handle.net/10397/108910
Title: Prediction system in big data analytics
Authors: Tang, Wai Man
Degree: Ph.D.
Issue Date: 2024
Abstract: Forecasting and causality analysis are essential to decision making and resource management because they relate outcomes to exogenous factors or events. In addition, investment return prediction is crucial for proper risk control and management. Applications built on advanced technologies are now part of daily life, and big data can be collected more easily and at lower cost. Knowledge can be extracted to indicate important changes in time series data, where the exogenous factors or events should be fit for purpose, as they can be instantaneous or aggregated over a certain duration. Prediction and causality are key functions in data analysis, where models can be used to extract useful features and predict data trends. Feature selection and extraction are crucial methodologies in data analysis, where sequential data is transformed into suitable features for further analysis. Relevant factors or features, which embed the essential information needed to explain the dependent variable, should be selected. This is critical to ensure useful models and accurate results.
In this thesis, our work focuses on two key types of methods: conjoining spatio-temporal data for analysis by neural networks with deep learning, and novel factor subset selection in time-frequency representations. Applications in various areas are studied. Chapter 2 investigates traffic speed data for multi-timestep forecasting. Congestion speed-cycle patterns of the target road segment are correlated with those of nearby road segments. An appropriate input subset can be selected for neural network training with deep learning when input data dimensions are minimal. Chapter 3 investigates the short-time Fourier transform (STFT), where consistent patterns are used to identify factor subsets. A multi-factor model with factors in different timeframes should be more useful and practical for forecasting future movements in a dynamic environment. Finally, Chapter 4 investigates wavelet transforms, where significant wavelet coefficients can be chosen as peaks by using the continuous wavelet transform (CWT). Causality can be established by multiple-factor models. Factor subsets are selected by factors with sample lags, which are represented by selecting appropriate wavelet coefficients in terms of both time and frequency.
Subjects: Big data; Data mining; Machine learning; Hong Kong Polytechnic University -- Dissertations
Pages: xiii, 152 pages : color illustrations
Appears in Collections: Thesis

Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.