Please use this identifier to cite or link to this item: http://hdl.handle.net/1942/778
Title: Construction of weak and strong similarity measures for ordered sets of documents using fuzzy set techniques
Authors: EGGHE, Leo; Michel, Ch.
Issue Date: 2003
Publisher: Elsevier
Source: Information Processing & Management, 39(5). p. 771-807
Abstract: Ordered sets of documents are encountered more and more in information distribution systems, such as information retrieval systems. Classical similarity measures for ordinary sets of documents hence need to be extended to these ordered sets. This is done in this paper using fuzzy set techniques. First, a general similarity measure is developed which contains the classical strong similarity measures such as Jaccard, Dice and Cosine, and which also contains the classical weak similarity measures such as Recall and Precision. Then these measures are extended to comparing fuzzy sets of documents. Measuring the similarity of ordered sets of documents is a special case of this, where the higher the rank of a document, the lower its weight in the fuzzy set. Concrete forms of these similarity measures are presented. All these measures are new, and the ones for the weak similarity measures are the first of their kind (other strong similarity measures were given in a previous paper by Egghe and Michel). Some of these measures are then tested in the IR system Profil-Doc. The engine SPIRIT© extracts ranked document sets in three different contexts, each for 600 requests. The practical usability of the OS-measures is then discussed based on these experiments.
Keywords: similarity measure; ordered set; fuzzy
Document URI: http://hdl.handle.net/1942/778
ISSN: 0306-4573
e-ISSN: 1873-5371
DOI: 10.1016/S0306-4573(02)00027-4
ISI #: 000184327400006
Category: A1
Type: Journal Contribution
Validations: ecoom 2004
Appears in Collections:Research publications
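
Illustration (not part of the repository record): the abstract describes extending set-based similarity measures to ranked lists by treating each list as a fuzzy set whose weights fall with rank. The Python sketch below shows one way this could look. It is only a sketch under stated assumptions: the 1/rank weighting, the min/max fuzzy intersection and union, the sum-of-memberships cardinality, and all function names are illustrative choices made here, and are not necessarily the concrete forms derived in the paper.

```python
from math import sqrt


def rank_weights(ranked_docs):
    """Turn a ranked list into a fuzzy set {doc: weight}.

    Assumption: weight = 1 / rank, so the top document has weight 1.0
    and later documents weigh less. The paper may use other weightings.
    """
    return {doc: 1.0 / rank for rank, doc in enumerate(ranked_docs, start=1)}


def card(fuzzy):
    """Fuzzy cardinality, taken here as the sum of membership weights."""
    return sum(fuzzy.values())


def intersection(a, b):
    """Fuzzy intersection using the minimum of the two memberships."""
    return {d: min(a[d], b[d]) for d in a.keys() & b.keys()}


def union(a, b):
    """Fuzzy union using the maximum of the two memberships."""
    return {d: max(a.get(d, 0.0), b.get(d, 0.0)) for d in a.keys() | b.keys()}


# Strong (symmetric) measures. Inputs are assumed to be non-empty.
def jaccard(a, b):
    return card(intersection(a, b)) / card(union(a, b))


def dice(a, b):
    return 2 * card(intersection(a, b)) / (card(a) + card(b))


def cosine(a, b):
    # With all weights equal to 1 this reduces to |A ∩ B| / sqrt(|A| * |B|).
    return card(intersection(a, b)) / sqrt(card(a) * card(b))


# Weak (asymmetric) measures: the result depends on which set is the reference.
def precision(retrieved, relevant):
    return card(intersection(retrieved, relevant)) / card(retrieved)


def recall(retrieved, relevant):
    return card(intersection(retrieved, relevant)) / card(relevant)


if __name__ == "__main__":
    first = rank_weights(["d1", "d2", "d3"])
    second = rank_weights(["d2", "d1", "d4"])
    print("Jaccard:  ", jaccard(first, second))
    print("Dice:     ", dice(first, second))
    print("Cosine:   ", cosine(first, second))
    print("Precision:", precision(first, second))
    print("Recall:   ", recall(first, second))
```

With every weight set to 1.0 these functions reduce to the classical set-based Jaccard, Dice, Cosine, Precision and Recall, which is the sense in which the ordered-set case generalizes the ordinary one.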

Files in This Item:
File | Description | Size | Format
constructionofweak.pdf | Non Peer-reviewed author version | 826.48 kB | Adobe PDF
weak 1.pdf (Restricted Access) | Published version | 1.67 MB | Adobe PDF

Scopus citations: 24 (checked on Sep 2, 2020)
Web of Science citations: 23 (checked on Apr 30, 2024)
Page views: 56 (checked on Sep 6, 2022)
Downloads: 192 (checked on Sep 6, 2022)

Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.