Title: Optimal bandwidth selection for re-substitution entropy estimation
Authors: He, YL
Liu, JNK
Wang, XZ
Hu, YX
Issue Date: 2012
Source: Applied mathematics and computation, 2012, v. 219, no. 8, p. 3425-3460
Abstract: This study presents a new fusion approach for selecting an optimal bandwidth for the re-substitution entropy estimator (RE). When continuous entropy is approximated through density estimation, two types of error arise: entropy estimation error (type-I error) and density estimation error (type-II error). Both errors depend strongly on the undetermined bandwidth. First, experiments on 24 typical probability distributions show that the optimal bandwidths associated with these two errors are inconsistent. Second, two different error measures are derived, one for type-I and one for type-II errors. The proposed method, called REI+II, rests on a trade-off between type-I and type-II errors: the two errors are fused and an optimal bandwidth for REI+II is solved. Finally, experimental comparisons are carried out to verify the estimation performance of the proposed strategy. Discretization has traditionally been regarded as a necessary preprocessing step for calculating continuous entropy, so the nine most commonly used unsupervised discretization methods are included to compare their computational performance with that of REI+II. Five of the most popular entropy estimators are also included in the comparison: splitting data estimator (SDE), cross-validation estimator (CVE), m-spacing estimator (mSE), mn-spacing estimator (mnSE), and nearest neighbor distance estimator (NNDE). The simulation studies on 24 different typical density distributions show that REI+II achieves better estimation performance than the other methods involved. The comparative results also reveal the estimation behaviors of the different entropy estimation methods. The empirical analysis demonstrates that REI+II is less sensitive to the data and generalizes better for the estimation of continuous entropy. REI+II makes it possible to derive a practical optimal bandwidth from a given dataset.
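For illustration only, the re-substitution idea described in the abstract can be sketched as follows: estimate the density with a kernel estimator, then average the negative log-density over the same sample. This is a minimal sketch under stated assumptions, not the REI+II bandwidth selection from the paper. The Gaussian kernel, the normal-reference rule of thumb h = 1.06 * sigma * n^(-1/5), the standard normal test sample, and all function names are assumptions introduced here.

```python
import numpy as np


def kde_at_samples(x, h):
    """Gaussian kernel density estimate evaluated at each sample point.

    Assumption: a Gaussian kernel; the abstract does not name the kernel.
    """
    x = np.asarray(x, dtype=float)
    # Pairwise scaled differences (x_i - x_j) / h, shape (n, n).
    diffs = (x[:, None] - x[None, :]) / h
    kernel = np.exp(-0.5 * diffs ** 2) / np.sqrt(2.0 * np.pi)
    return kernel.sum(axis=1) / (x.size * h)


def resubstitution_entropy(x, h):
    """Re-substitution estimate -(1/n) * sum_i log p_hat(x_i), in nats."""
    return float(-np.mean(np.log(kde_at_samples(x, h))))


def rule_of_thumb_bandwidth(x):
    """Normal-reference rule of thumb (a stand-in, not the paper's method)."""
    x = np.asarray(x, dtype=float)
    return float(1.06 * x.std(ddof=1) * x.size ** (-0.2))


# Hypothetical test data: a standard normal sample with known entropy.
rng = np.random.default_rng(0)
sample = rng.normal(0.0, 1.0, size=1000)
true_entropy = 0.5 * np.log(2.0 * np.pi * np.e)

# The estimate changes with the bandwidth, which is why its choice matters.
for h in (0.05, rule_of_thumb_bandwidth(sample), 1.0):
    estimate = resubstitution_entropy(sample, h)
    print(f"h={h:.3f}  estimate={estimate:.4f}  true={true_entropy:.4f}")
```

Running the loop shows the estimate drifting as the bandwidth moves from very small to very large, which is the sensitivity that motivates selecting the bandwidth carefully.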
Keywords: Discretization
Information entropy
Integrated mean square error
Optimal bandwidth
Probability density estimation
Re-substitution entropy estimator
Publisher: Elsevier
Journal: Applied mathematics and computation 
ISSN: 0096-3003
EISSN: 1873-5649
DOI: 10.1016/j.amc.2012.08.056
Appears in Collections: Journal/Magazine Article
