Please use this identifier to cite or link to this item: http://hdl.handle.net/1942/13632
Title: Mining frequent itemsets in a stream
Authors: Calders, Toon; Dexters, Nele; Gillis, Joris; Goethals, Bart
Issue Date: 2014
Source: INFORMATION SYSTEMS, 39, p. 233-255
Abstract: Mining frequent itemsets in a datastream is a difficult problem, as itemsets arrive in rapid succession and storing parts of the stream is typically impossible. Nonetheless, it has many useful applications, e.g., opinion and sentiment analysis from social networks. Current stream mining algorithms are based on approximations. In earlier work, the max-frequency measure proved to be effective for mining frequent items in a stream. In this paper, we extend that work from items to itemsets. Firstly, an optimized incremental algorithm for mining frequent itemsets in a stream is presented. The algorithm maintains a very compact summary of the stream for selected itemsets. Secondly, we show that further compacting the summary is non-trivial. Thirdly, we establish a connection between the size of a summary and results from number theory. Fourthly, we report results of extensive experimentation on both synthetic and real-world datasets, showing the efficiency of the algorithm in terms of both time and space.
Notes: Gillis, JJM (reprint author), Hasselt Univ, Agoralaan Gebouw D, B-3590 Diepenbeek, Belgium, joris.gillis@uhasselt.be
Keywords: Frequent itemset mining; Datastream; Theory; Algorithm; Experiments
Document URI: http://hdl.handle.net/1942/13632
ISSN: 0306-4379
e-ISSN: 1873-6076
DOI: 10.1016/j.is.2012.01.005
ISI #: 000329531300012
Category: A1
Type: Journal Contribution
Appears in Collections: Research publications

Files in This Item:
File            Description         Size        Format
paper.pdf       Main article        420.41 kB   Adobe PDF
calders 1.pdf   published version   1.61 MB     Adobe PDF

Scopus citations: 40 (checked on Sep 2, 2020)
Web of Science citations: 39 (checked on Dec 4, 2022)
Page view(s): 58 (checked on Sep 7, 2022)
Download(s): 358 (checked on Sep 7, 2022)

Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.