Please use this identifier to cite or link to this item:
Title: Efficient in-memory extensible inverted file
Authors: Luk, RWP 
Lam, W
Keywords: Indexing
Information retrieval
Issue Date: 2007
Publisher: Pergamon-Elsevier Science Ltd
Source: Information systems, 2007, v. 32, no. 5, p. 733-754 How to cite?
Journal: Information Systems 
Abstract: The growing amount of on-line data demands efficient parallel and distributed indexing mechanisms to manage large resource requirements and unpredictable system failures. Parallel and distributed indices built using commodity hardware like personal computers (PCs) can substantially save cost because PCs are produced in bulk, achieving the scale of economy. However, PCs have limited amount of random access memory (RAM) and the effective utilization of RAM for in-memory inversion is crucial. This paper presents an analytical investigation and an empirical evaluation of storage-efficient inmemory extensible inverted files, which are represented by fixed- or variable-sized linked list nodes. The size of these linked list nodes is determined by minimizing the storage wastes or maximizing storage utilization under different conditions, which lead to different storage allocation schemes. Minimizing storage wastes also reduces the number of address indirections (i.e., chaining). We evaluated our storage allocation schemes using a number of reference collections. We found that the arrival rate scheme is the best in terms of both storage utilization and the mean number of chainings per term. The final storage utilization can be over 90% in our evaluation if there is a sufficient number of documents indexed. The mean number of chainings is not large (less than 2.6 for all the reference collections). We have also showed that our best storage allocation scheme can be used for our extensible compressed inverted file. The final storage utilization of the extensible compressed inverted file can be over 90% in our evaluation provided that there is a sufficient number of documents indexed. The proposed storage allocation schemes can also be used by compressed extensible inverted files with word positions.
DOI: 10.1016/
Appears in Collections:Journal/Magazine Article

View full-text via PolyU eLinks SFX Query
Show full item record


Last Week
Last month
Citations as of Aug 11, 2018


Last Week
Last month
Citations as of Jul 22, 2018

Page view(s)

Last Week
Last month
Citations as of Aug 20, 2018

Google ScholarTM



Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.