Title: Novel mutation operators & their application to regularization of artificial neural networks
Authors: Chan, Shun-heng
Degree: Ph.D.
Issue Date: 2003
Abstract: In recent years, advances in Artificial Intelligence and the availability of high-speed computers have renewed interest in many system-modeling methods that were previously considered too "heuristic" (non-algorithmic) or too computationally intensive. In this thesis, we study three of these modern methods: Evolutionary Computation (EC), a derivative-free, global optimization method whose branches include Genetic Algorithms (GA) and Evolutionary Strategies (ES); Markov Chain Monte Carlo (MCMC), a stochastic simulation method for approximating high-dimensional integrals; and Artificial Neural Networks (ANNs), connectionist structures capable of learning complex, nonlinear mappings. In contrast to traditional methods, which rely on the reductionist approaches of linear modeling and local optimization, these modern methods abandon many simplistic assumptions to provide a more complex, yet more comprehensive, approach to the problem. While these modern methods provide a useful alternative to the traditional methods, they, too, have weaknesses that hinder the utilization of their full potential. This thesis concerns the design of two novel mutation operators that alleviate the weaknesses of GA and ES, and their application to hybrid GA-ANN and MCMC-ANN systems to provide complete parameter estimation, including both conventional (frequentist) and Bayesian parameter estimation, of regularized ANNs. The first novel mutation operator, called MARK, alleviates the problem of premature convergence in GA. In contrast to the conventional mutation operator, which uses a probabilistic (soft) operation, MARK uses a deterministic (hard) operation. Probabilistic analyses with unimodal functions and empirical experiments with test functions show that MARK converges faster and achieves higher precision than the conventional mutation operator.
To verify its feasibility for complex engineering problems, MARK is applied to a hybrid GA-ANN system that performs conventional parameter estimation, or optimization, of a regularized ANN. An experiment that optimizes a regularized ANN electricity load forecaster with 507 connection weights and 75 hyperparameters shows that incorporating MARK significantly improves the performance of the hybrid system. The second novel mutation operator, called Selection Follower (SF), enhances the covariance-adaptive search in ES. SF replaces the conventional covariance evaluation with linear recombination for producing correlated noise, thereby reducing the computational cost and removing the computation error at the same time. The principle of SF is proven using general probability theory, and its performance is demonstrated in empirical experiments with common test functions. To verify its feasibility for complex problems, SF is incorporated into a multi-chain MCMC algorithm called Adaptive Metropolis Sampling (AMS) for Bayesian parameter estimation of regularized ANNs. Using an EC-based mutation operator in MCMC is itself an innovative procedure, which involves identifying the similarities between EC and MCMC and establishing a theoretical basis for its technical feasibility. Experiments with simple normal distributions and the regularized ANN show that incorporating SF enhances the performance of the hybrid system in terms of mixing rate. It is concluded that the two novel mutation operators, MARK and SF, are useful for enhancing the performance of EC and of its corresponding applications to parameter estimation of regularized ANNs.
Subjects: Evolutionary computation
Neural networks (Computer science)
Hong Kong Polytechnic University -- Dissertations
Pages: xiv, 151 leaves : ill. ; 30 cm
Appears in Collections: Thesis

Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.