Please use this identifier to cite or link to this item: http://hdl.handle.net/10397/10130
Title: Interpreting TF-IDF term weights as making relevance decisions
Authors: Wu, HC
Luk, RWP 
Wong, KF
Kwok, KL
Keywords: Information retrieval
Relevance decision
Term weight
Issue Date: 2008
Publisher: Association for Computing Machinery
Source: ACM transactions on information systems, 2008, v. 26, no. 3, 13
Journal: ACM Transactions on Information Systems 
Abstract: A novel probabilistic retrieval model is presented. It forms a basis to interpret the TF-IDF term weights as making relevance decisions. It simulates the local relevance decision-making for every location of a document, and combines all of these "local" relevance decisions as the "document-wide" relevance decision for the document. The significance of interpreting TF-IDF in this way is the potential to: (1) establish a unifying perspective about information retrieval as relevance decision-making; and (2) develop advanced TF-IDF-related term weights for future elaborate retrieval models. Our novel retrieval model is simplified to a basic ranking formula that directly corresponds to the TF-IDF term weights. In general, we show that the term-frequency factor of the ranking formula can be rendered into different term-frequency factors of existing retrieval systems. In the basic ranking formula, the remaining quantity −log p(r̄ | t ∈ d) is interpreted as the probability of randomly picking a nonrelevant usage (denoted by r̄) of term t. Mathematically, we show that this quantity can be approximated by the inverse document-frequency (IDF). Empirically, we show that this quantity is related to IDF, using four reference TREC ad hoc retrieval data collections.
URI: http://hdl.handle.net/10397/10130
DOI: 10.1145/1361684.1361686
Appears in Collections: Journal/Magazine Article
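
Illustration: the abstract relates a basic ranking formula to TF-IDF term weights and says the quantity −log p(r̄ | t ∈ d) can be approximated by IDF. The sketch below shows only the textbook TF-IDF ranking (raw term frequency times log(N/df)), not the authors' probabilistic model. The toy documents, the function names, and the df/N estimate of the probability are assumptions made for illustration; under that estimate the negative log is exactly the textbook IDF.

```python
import math
from collections import Counter


def idf(term, docs):
    """Textbook inverse document frequency: log(N / df), or 0.0 if the term is unseen."""
    n_docs = len(docs)
    df = sum(1 for doc in docs if term in doc)
    return math.log(n_docs / df) if df else 0.0


def neg_log_doc_fraction(term, docs):
    """-log(df / N): assumes (illustration only) the probability is estimated as df / N."""
    df = sum(1 for doc in docs if term in doc)
    return -math.log(df / len(docs))


def tf_idf_score(query, doc, docs):
    """Sum of raw term frequency times IDF over the query terms."""
    counts = Counter(doc)
    return sum(counts[t] * idf(t, docs) for t in query)


# Hypothetical toy collection: each document is a list of tokens.
docs = [
    ["term", "weight", "retrieval"],
    ["retrieval", "model", "retrieval"],
    ["relevance", "decision", "model"],
]
query = ["retrieval", "weight"]

# Rank document indices by descending TF-IDF score.
ranking = sorted(
    range(len(docs)),
    key=lambda i: tf_idf_score(query, docs[i], docs),
    reverse=True,
)
print(ranking)
```

A rare term such as "weight" (one document out of three) receives a larger IDF than "retrieval" (two out of three), so the first document outranks the second even though the second repeats "retrieval".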


Scopus citations: 194 (last week: 1; last month: 2), as of Jul 28, 2017
Web of Science citations: 92 (last week: 1; last month: 4), as of Aug 13, 2017
Page views: 64 (last week: 4), checked on Aug 14, 2017

Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.