A comparison between term-independence retrieval models for ad hoc retrieval

Dang, EKF; Luk, RWP; Allan, J

doi:10.1145/3483612

Please use this identifier to cite or link to this item: http://hdl.handle.net/10397/98870

DC Field	Value	Language
dc.contributor	Department of Computing	en_US
dc.creator	Dang, EKF	en_US
dc.creator	Luk, RWP	en_US
dc.creator	Allan, J	en_US
dc.date.accessioned	2023-06-01T06:05:18Z	-
dc.date.available	2023-06-01T06:05:18Z	-
dc.identifier.issn	1046-8188	en_US
dc.identifier.uri	http://hdl.handle.net/10397/98870	-
dc.language.iso	en	en_US
dc.publisher	Association for Computing Machinery	en_US
dc.rights	© 2021 Association for Computing Machinery. This is the author's version of the work. It is posted here for your personal use. Not for redistribution. The definitive Version of Record was published in ACM Transactions on Information Systems, http://dx.doi.org/10.1145/3483612.	en_US
dc.subject	Comparison	en_US
dc.subject	Evaluation	en_US
dc.subject	Information retrieval	en_US
dc.subject	Multiple hypotheses testing	en_US
dc.subject	Retrieval model	en_US
dc.title	A comparison between term-independence retrieval models for ad hoc retrieval	en_US
dc.type	Journal/Magazine Article	en_US
dc.identifier.volume	40	en_US
dc.identifier.issue	3	en_US
dc.identifier.doi	10.1145/3483612	en_US
dcterms.abstract	In Information Retrieval, numerous retrieval models or document ranking functions have been developed in the quest for better retrieval effectiveness. Apart from some formal retrieval models formulated on a theoretical basis, various recent works have applied heuristic constraints to guide the derivation of document ranking functions. While many recent methods are shown to improve over established and successful models, comparison among these new methods under a common environment is often missing. To address this issue, we perform an extensive and up-to-date comparison of leading term-independence retrieval models implemented in our own retrieval system. Our study focuses on the following questions: (RQ1) Is there a retrieval model that consistently outperforms all other models across multiple collections? (RQ2) What are the important features of an effective document ranking function? Our retrieval experiments performed on several TREC test collections of a wide range of sizes (up to the terabyte-sized ClueWeb09 Category B) enable us to answer these research questions. This work also serves as a reproducibility study for leading retrieval models. While our experiments show that no single retrieval model outperforms all others across all tested collections, some recent retrieval models, such as MATF and MVD, consistently perform better than the common baselines.	en_US
dcterms.accessRights	open access	en_US
dcterms.bibliographicCitation	ACM transactions on information systems, July 2022, v. 40, no. 3, 62	en_US
dcterms.isPartOf	ACM transactions on information systems	en_US
dcterms.issued	2022-07	-
dc.identifier.scopus	2-s2.0-85127622824	-
dc.identifier.eissn	1558-2868	en_US
dc.identifier.artn	62	en_US
dc.description.validate	202305 bcww	en_US
dc.description.oa	Accepted Manuscript	en_US
dc.identifier.FolderNumber	a2050	-
dc.identifier.SubFormID	46378	-
dc.description.fundingSource	Others	en_US
dc.description.fundingText	HK PolyU project P0030932	en_US
dc.description.pubStatus	Published	en_US
dc.description.oaCategory	Green (AAM)	en_US
Appears in Collections:	Journal/Magazine Article

Files in This Item:

File	Description	Size	Format
Dang_Comparison_Term-independence_Retrieval .pdf	Pre-Published version	753.57 kB	Adobe PDF

Open Access Information

Status	open access
File Version	Final Accepted Manuscript

Metric	Count	As of
Page views	82	Apr 14, 2025
Downloads	97	Apr 14, 2025
Scopus citations	3	Sep 12, 2025