Please use this identifier to cite or link to this item: http://hdl.handle.net/10397/96209
PIRA download icon_1.1View/Download Full Text
Title: Exploit every bit : effective caching for high-dimensional nearest neighbor search
Authors: Tang, B 
Yiu, ML 
Hua, KA
Issue Date: May-2016
Source: IEEE transactions on knowledge and data engineering, May 2016, v. 28, no. 5, 7374714, p. 1175-1188
Abstract: High-dimensional k nearest neighbor (kNN) search has a wide range of applications in multimedia information retrieval. Existing disk-based k NN search methods incur significant I/O costs in the candidate refinement phase. In this paper, we propose to cache compact approximate representations of data points in main memory in order to reduce the candidate refinement time during k NN search. This problem raises two challenging issues: (i) which is the most effective encoding scheme for data points to support k NN search? and (ii) what is the optimal number of bits for encoding a data point? For (i), we formulate and solve a novel histogram optimization problem that decides the most effective encoding scheme. For (ii), we develop a cost model for automatically tuning the optimal number of bits for encoding points. In addition, our approach is generic and applicable to exact/approximate k NN search methods. Extensive experimental results on real datasets demonstrate that our proposal can accelerate the candidate refinement time of k NN search by at least an order of magnitude.
Keywords: Caching
High dimensional data
Histogram
Similarity search
Publisher: Institute of Electrical and Electronics Engineers
Journal: IEEE transactions on knowledge and data engineering 
ISSN: 1041-4347
EISSN: 1558-2191
DOI: 10.1109/TKDE.2016.2515603
Rights: © 2016 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.
The following publication B. Tang, M. L. Yiu and K. A. Hua, "Exploit Every Bit: Effective Caching for High-Dimensional Nearest Neighbor Search," in IEEE Transactions on Knowledge and Data Engineering, vol. 28, no. 5, pp. 1175-1188, 1 May 2016 is available at https://doi.org/10.1109/TKDE.2016.2515603.
Appears in Collections:Journal/Magazine Article

Files in This Item:
File Description SizeFormat 
Exploit_Every_Bit.pdfPre-Published version1.21 MBAdobe PDFView/Open
Open Access Information
Status open access
File Version Final Accepted Manuscript
Access
View full-text via PolyU eLinks SFX Query
Show full item record

Page views

156
Last Week
0
Last month
Citations as of Oct 13, 2024

Downloads

140
Citations as of Oct 13, 2024

SCOPUSTM   
Citations

7
Citations as of Oct 17, 2024

WEB OF SCIENCETM
Citations

4
Citations as of Oct 17, 2024

Google ScholarTM

Check

Altmetric


Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.