Please use this identifier to cite or link to this item: http://hdl.handle.net/1942/33418
Full metadata record
DC Field | Value | Language
dc.contributor.author | Fagin, Ronald | -
dc.contributor.author | Kimelfeld, Benny | -
dc.contributor.author | Reiss, Frederick | -
dc.contributor.author | VANSUMMEREN, Stijn | -
dc.date.accessioned | 2021-02-12T10:18:10Z | -
dc.date.available | 2021-02-12T10:18:10Z | -
dc.date.issued | 2013 | -
dc.date.submitted | 2021-02-11T10:43:45Z | -
dc.identifier.citation | Proceedings of the 32nd ACM SIGMOD-SIGACT-SIGAI Symposium on Principles of Database Systems, Association for Computing Machinery, p. 37-48 | -
dc.identifier.isbn | 9781450320665 | -
dc.identifier.uri | http://hdl.handle.net/1942/33418 | -
dc.description.abstract | An intrinsic part of information extraction is the creation and manipulation of relations extracted from text. In this paper, we develop a foundational framework where the central construct is what we call a spanner. A spanner maps an input string into relations over the spans (intervals specified by bounding indices) of the string. The focus of this paper is on the representation of spanners. Conceptually, there are two kinds of such representations. Spanners defined in a primitive representation extract relations directly from the input string; those defined in an algebra apply algebraic operations to the primitively represented spanners. This framework is driven by SystemT, an IBM commercial product for text analysis, where the primitive representation is that of regular expressions with capture variables. We define additional types of primitive spanner representations by means of two kinds of automata that assign spans to variables. We prove that the first kind has the same expressive power as regular expressions with capture variables; the second kind expresses precisely the algebra of the regular spanners, that is, the closure of the first kind under standard relational operators. The core spanners extend the regular ones by string-equality selection (an extension used in SystemT). We give some fundamental results on the expressiveness of regular and core spanners. As an example, we prove that regular spanners are closed under difference (and complement), but core spanners are not. Finally, we establish connections with related notions in the literature. | -
dc.language.iso | en | -
dc.publisher | Association for Computing Machinery | -
dc.subject.other | H.2.1 [Database Management]: Logical Design-Data models | -
dc.subject.other | H.2.4 [Database Management]: Systems-Textual databases, Relational databases, Rule-based databases | -
dc.subject.other | I.5.4 [Pattern Recognition]: Applications-Text processing | -
dc.subject.other | F.4.3 [Mathematical Logic and Formal Languages]: Formal Languages-Algebraic language theory, Classes defined by grammars or automata, Operations on languages | -
dc.subject.other | F.1.1 [Computation by Abstract Devices]: Models of Computation-Automata, Relations between models. Keywords: Information extraction, spanners, regular expressions, finite-state automata | -
dc.title | Spanners: A Formal Framework for Information Extraction | -
dc.type | Proceedings Paper | -
local.bibliographicCitation.conferencedate | June 2013 | -
local.bibliographicCitation.conferencename | Symposium on Principles of Database Systems | -
local.bibliographicCitation.conferenceplace | New York (New York), USA | -
dc.identifier.epage | 48 | -
dc.identifier.spage | 37 | -
local.bibliographicCitation.jcat | C1 | -
local.type.refereed | Refereed | -
local.type.specified | Proceedings Paper | -
dc.identifier.doi | 10.1145/2463664.2463665 | -
local.provider.type | CrossRef | -
local.bibliographicCitation.btitle | Proceedings of the 32nd ACM SIGMOD-SIGACT-SIGAI Symposium on Principles of Database Systems | -
local.uhasselt.uhpub | no | -
local.uhasselt.international | yes | -
item.contributor | Fagin, Ronald | -
item.contributor | Kimelfeld, Benny | -
item.contributor | Reiss, Frederick | -
item.contributor | VANSUMMEREN, Stijn | -
item.fullcitation | Fagin, Ronald; Kimelfeld, Benny; Reiss, Frederick & VANSUMMEREN, Stijn (2013) Spanners: A Formal Framework for Information Extraction. In: Proceedings of the 32nd ACM SIGMOD-SIGACT-SIGAI Symposium on Principles of Database Systems, Association for Computing Machinery, p. 37-48. | -
item.accessRights | Restricted Access | -
item.fulltext | With Fulltext | -
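
Illustrative note (not part of the repository record): the abstract describes a spanner as a mapping from an input string to relations over spans, with regular expressions with capture variables as the primitive representation and string-equality selection as the extension that yields core spanners. The Python sketch below is only a rough approximation of that idea under stated assumptions: it uses Python's `re` named groups as stand-ins for capture variables, represents spans as 0-based half-open offsets, and enumerates only the non-overlapping matches that `re.finditer` returns, which is narrower than the paper's formal semantics. The function names `regex_spanner` and `select_string_equal` are hypothetical and do not come from the paper or from SystemT.

```python
import re
from typing import Dict, List, Tuple

# Assumption: a span is a half-open (start, end) pair of 0-based offsets.
Span = Tuple[int, int]
Row = Dict[str, Span]


def regex_spanner(pattern: str, text: str) -> List[Row]:
    """Map a string to a relation: one row per match, one column per named group."""
    compiled = re.compile(pattern)
    variables = list(compiled.groupindex)  # named groups act as capture variables
    return [{var: m.span(var) for var in variables} for m in compiled.finditer(text)]


def select_string_equal(rows: List[Row], text: str, x: str, y: str) -> List[Row]:
    """Keep rows whose spans x and y cover equal substrings (the spans may differ)."""
    def substring(span: Span) -> str:
        return text[span[0]:span[1]]
    return [row for row in rows if substring(row[x]) == substring(row[y])]


text = "ab=ab cd=ce"
rows = regex_spanner(r"(?P<x>\w+)=(?P<y>\w+)", text)
equal_rows = select_string_equal(rows, text, "x", "y")
print(rows)        # two rows, one per "left=right" pair
print(equal_rows)  # only the row where both sides spell the same string
```

The selection step compares the substrings that two spans cover rather than the spans themselves, which is the distinction the abstract draws between regular spanners and core spanners.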
Appears in Collections: Research publications
Files in This Item:
File | Description | Size | Format
1-aqlf.pdf (Restricted Access) | Published version | 371.41 kB | Adobe PDF

Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.