Title: Optimal bandwidth selection for re-substitution entropy estimation
Authors: He, YL
Liu, JNK
Wang, XZ
Hu, YX
Keywords: Discretization
Information entropy
Integrated mean square error
Optimal bandwidth
Probability density estimation
Re-substitution entropy estimator
Issue Date: 2012
Publisher: Elsevier
Source: Applied mathematics and computation, 2012, v. 219, no. 8, p. 3425-3460
Journal: Applied mathematics and computation 
Abstract: This study presents a new fusion approach for selecting an optimal bandwidth for the re-substitution entropy estimator (RE). Approximating continuous entropy through density estimation generates two types of error: entropy estimation error (type-I error) and density estimation error (type-II error). Both errors depend strongly on the undetermined bandwidth. First, experiments on 24 typical probability distributions show that the optimal bandwidths associated with the two errors are inconsistent with each other. Second, separate error measures are derived for the type-I and type-II errors. A trade-off between type-I and type-II errors is a fundamental property of the proposed method, called REI+II; the two errors are therefore fused, and an optimal bandwidth for REI+II is solved. Finally, experimental comparisons verify the estimation performance of the proposed strategy. Discretization has traditionally been deemed a necessary preprocessing step for calculating continuous entropy, so the nine most commonly used unsupervised discretization methods are compared with REI+II in terms of computational performance. Five of the most popular entropy estimators are also included in the comparisons: the splitting data estimator (SDE), cross-validation estimator (CVE), m-spacing estimator (mSE), mn-spacing estimator (mnSE), and nearest neighbor distance estimator (NNDE). Simulation studies on 24 different typical density distributions show that REI+II obtains better estimation performance than the other methods involved. The comparative results also reveal the estimation behaviors of the different entropy estimation methods. The empirical analysis demonstrates that REI+II is less sensitive to the data and generalizes better for the estimation of continuous entropy. REI+II makes it possible to derive a handy optimal bandwidth from a given dataset.
ISSN: 0096-3003
EISSN: 1873-5649
DOI: 10.1016/j.amc.2012.08.056
Appears in Collections: Journal/Magazine Article
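
Illustration (not part of the record): the abstract's point that a re-substitution entropy estimate depends strongly on the bandwidth can be sketched as follows. This is a minimal sketch under stated assumptions, not the REI+II method of the paper: it assumes a Gaussian kernel, one-dimensional data, and the standard normal-reference bandwidth rule as a baseline. The function names are hypothetical.

```python
import numpy as np


def gaussian_kde_at_samples(x, h):
    """Evaluate a Gaussian kernel density estimate at each sample point.

    Assumption: 1-D data and a Gaussian kernel with bandwidth h.
    """
    x = np.asarray(x, dtype=float)
    diffs = (x[:, None] - x[None, :]) / h  # pairwise scaled differences
    kernel = np.exp(-0.5 * diffs**2) / np.sqrt(2.0 * np.pi)
    return kernel.sum(axis=1) / (x.size * h)


def resubstitution_entropy(x, h):
    """Re-substitution estimate: H = -(1/n) * sum_i log f_hat(x_i)."""
    return -np.mean(np.log(gaussian_kde_at_samples(x, h)))


def normal_reference_bandwidth(x):
    """Baseline bandwidth 1.06 * sigma * n^(-1/5); NOT the paper's optimum."""
    x = np.asarray(x, dtype=float)
    return 1.06 * x.std(ddof=1) * x.size ** (-0.2)


# Standard normal sample, whose true entropy is 0.5 * log(2 * pi * e).
rng = np.random.default_rng(0)
sample = rng.standard_normal(1000)
true_entropy = 0.5 * np.log(2.0 * np.pi * np.e)

for h in (1e-4, normal_reference_bandwidth(sample), 5.0):
    estimate = resubstitution_entropy(sample, h)
    print(f"h={h:.4f}  estimate={estimate:.4f}  true={true_entropy:.4f}")
```

In this sketch a very small bandwidth drives the estimate below the true entropy, because each point's own kernel dominates the density at that point. A very large bandwidth oversmooths and drives the estimate above it. This is the bandwidth sensitivity that motivates selecting the bandwidth with an explicit error criterion.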

Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.