Please use this identifier to cite or link to this item:
http://hdl.handle.net/1942/45025
Field | Value
---|---
Title | SpannerLib: Embedding Declarative Information Extraction in an Imperative Workflow
Authors | Light, Dean; Aiashy, Ahmad; Diab, Mahmoud; Nachmias, Daniel; VANSUMMEREN, Stijn; Kimelfeld, Benny
Issue Date | 2024
Publisher | ASSOC COMPUTING MACHINERY
Source | Proceedings of the VLDB Endowment, 17 (12), p. 4281-4284
Abstract | Document spanners have been proposed as a formal framework for declarative Information Extraction (IE) from text, following IE products from the industry and academia. Over the past decade, the framework has been studied thoroughly in terms of expressive power, complexity, and the ability to naturally combine text analysis with relational querying. This demonstration presents SPANNERLIB, a library for embedding document spanners in Python code. SPANNERLIB facilitates the development of IE programs by providing an implementation of Spannerlog (Datalog-based document spanners) that interacts with the Python code in two directions: rules can be embedded inside Python, and they can invoke custom Python code (e.g., calls to ML-based NLP models) via user-defined functions. The demonstration scenarios showcase IE programs, with increasing levels of complexity, within Jupyter Notebook.
Notes | Light, D (corresponding author), Technion, Haifa, Israel. dean.light92@gmail.com; ahmad-ai@campus.technion.ac.il; mahmoud.diab@campus.technion.ac.il; nach.daniel@gmail.com; stijn.vansummeren@uhasselt.be; bennyk@cs.technion.ac.il
Document URI | http://hdl.handle.net/1942/45025
ISSN | 2150-8097
e-ISSN | 2150-8097
DOI | 10.14778/3685800.3685855
ISI # | 001378223700007
Rights | This work is licensed under the Creative Commons BY-NC-ND 4.0 International License. Visit https://creativecommons.org/licenses/by-nc-nd/4.0/ to view a copy of this license. For any use beyond those covered by this license, obtain permission by emailing info@vldb.org. Copyright is held by the owner/author(s). Publication rights licensed to the VLDB Endowment.
Category | A1
Type | Journal Contribution
Appears in Collections: | Research publications |
Files in This Item:
File | Description | Size | Format
---|---|---|---
SpannerLib_ Embedding Declarative Information Extraction in an Imperative Workflow.pdf | Published version | 648.93 kB | Adobe PDF