Efficient constraint evaluation in categorical sequential pattern mining for trajectory databases

Gomez, Letitia; VAISMAN, Alejandro

Please use this identifier to cite or link to this item: http://hdl.handle.net/1942/10651

Full metadata record

DC Field	Value	Language
dc.contributor.author	Gomez, Letitia	-
dc.contributor.author	VAISMAN, Alejandro	-
dc.date.accessioned	2010-03-04T09:43:00Z	-
dc.date.available	2010-03-04T09:43:00Z	-
dc.date.issued	2009	-
dc.identifier.citation	Kersten, Martin & Novikov, Boris & Teubner, Jens & Polutin, Vladimir & Manegold, Stefan (Ed.) Proceedings of the 12th International Conference on Extending Database Technology. p. 541-552.	-
dc.identifier.isbn	978-1-60558-422-5	-
dc.identifier.uri	http://hdl.handle.net/1942/10651	-
dc.description.abstract	The classic Generalized Sequential Patterns (GSP) algorithm returns all frequent sequences present in a database. However, usually only a few of them are interesting from a user's point of view. Thus, post-processing tasks are required in order to discard uninteresting sequences. To avoid this drawback, languages based on regular expressions (RE) were proposed to restrict frequent sequences to the ones that satisfy user-specified constraints. In all of these languages, REs are applied over items, which limits their applicability in complex real-world situations. We propose a much more powerful language, based on regular expressions, denoted RE-SPaM, where the basic elements are constraints defined over the (temporal and non-temporal) attributes of the items to be mined. Expressions in this language may include attributes, functions over attributes, and variables. We specify the syntax and semantics of RE-SPaM, and present a comprehensive set of examples to illustrate its expressive power. We study in detail how the expressions can be used to prune the resulting sequences in the mining process. In addition, we introduce techniques that allow pruning sequences in the early stages of the process, reducing the need to access the database, making use of the categorization of the attributes that compose the items, and of the automaton that accepts the language generated by the RE. Finally, we present experimental results. Although in this paper we focus on trajectory databases, our approach is general enough to be applied to other settings.	-
dc.language.iso	en	-
dc.publisher	ACM International Conference Proceeding Series	-
dc.title	Efficient constraint evaluation in categorical sequential pattern mining for trajectory databases	-
dc.type	Proceedings Paper	-
local.bibliographicCitation.authors	Kersten, Martin	-
local.bibliographicCitation.authors	Novikov, Boris	-
local.bibliographicCitation.authors	Teubner, Jens	-
local.bibliographicCitation.authors	Polutin, Vladimir	-
local.bibliographicCitation.authors	Manegold, Stefan	-
local.bibliographicCitation.conferencename	12th International Conference on Extending Database Technology	-
dc.bibliographicCitation.conferencenr	12	-
local.bibliographicCitation.conferenceplace	Saint Petersburg, Russia, March 24-26, 2009	-
dc.identifier.epage	552	-
dc.identifier.spage	541	-
local.bibliographicCitation.jcat	C1	-
local.type.specified	Proceedings Paper	-
dc.bibliographicCitation.oldjcat	C2	-
dc.identifier.url	http://doi.acm.org/10.1145/1516360.1516423	-
local.bibliographicCitation.btitle	Proceedings of the 12th International Conference on Extending Database Technology	-
item.fulltext	No Fulltext	-
item.accessRights	Closed Access	-
item.contributor	Gomez, Letitia	-
item.contributor	VAISMAN, Alejandro	-
item.fullcitation	Gomez, Letitia & VAISMAN, Alejandro (2009) Efficient constraint evaluation in categorical sequential pattern mining for trajectory databases. In: Kersten, Martin & Novikov, Boris & Teubner, Jens & Polutin, Vladimir & Manegold, Stefan (Ed.) Proceedings of the 12th International Conference on Extending Database Technology. p. 541-552.	-
Appears in Collections:	Research publications
