Please use this identifier to cite or link to this item: http://hdl.handle.net/1942/13632
Title: Mining frequent itemsets in a stream
Authors: Calders, Toon
Dexters, Nele
Gillis, Joris
Goethals, Bart
Issue Date: 2014
Source: INFORMATION SYSTEMS, 39, p. 233-255
Abstract: Mining frequent itemsets in a datastream is a difficult problem, because itemsets arrive in rapid succession and storing parts of the stream is typically impossible. Nonetheless, it has many useful applications, e.g., opinion and sentiment analysis from social networks. Current stream mining algorithms are based on approximations. In earlier work, mining frequent items in a stream under the max-frequency measure proved to be effective. In this paper, we extend that work from items to itemsets. Firstly, an optimized incremental algorithm for mining frequent itemsets in a stream is presented. The algorithm maintains a very compact summary of the stream for selected itemsets. Secondly, we show that further compacting the summary is non-trivial. Thirdly, we establish a connection between the size of a summary and results from number theory. Fourthly, we report results of extensive experimentation on both synthetic and real-world datasets, showing the efficiency of the algorithm in terms of both time and space.
Notes: Gillis, JJM (reprint author), Hasselt Univ, Agoralaan Gebouw D, B-3590 Diepenbeek, Belgium, joris.gillis@uhasselt.be
Keywords: Frequent itemset mining; Datastream; Theory; Algorithm; Experiments
Document URI: http://hdl.handle.net/1942/13632
ISSN: 0306-4379
e-ISSN: 1873-6076
DOI: 10.1016/j.is.2012.01.005
ISI #: 000329531300012
Category: A1
Type: Journal Contribution
Validations: ecoom 2015
Appears in Collections: Research publications

Files in This Item:
File | Description | Size | Format
paper.pdf | Peer-reviewed author version | 420.41 kB | Adobe PDF
calders 1.pdf | Published version | 1.61 MB | Adobe PDF

Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.