Title: A graph model of data and workflow provenance
Authors: Acar, U; Buneman, P; Cheney, J
Issue Date: 2010
Source: Proceedings TaPP'10.
Abstract: Provenance has been studied extensively in both database and workflow management systems, so far with little convergence of definitions or models. Provenance in databases has generally been defined for relational or complex object data, by propagating fine-grained annotations or algebraic expressions from the input to the output. This kind of provenance has been found useful in other areas of computer science: annotation databases, probabilistic databases, schema and data integration, etc. In contrast, workflow provenance aims to capture a complete description of evaluation – or enactment – of a workflow, and this is crucial to verification in scientific computation. Workflows and their provenance are often presented using graphical notation, making them easy to visualize but complicating the formal semantics that relates their run-time behavior with their provenance records. We bridge this gap by extending a previously-developed dataflow language which supports both database-style querying and workflow-style batch processing steps to produce a workflow-style provenance graph that can be explicitly queried. We define and describe the model through examples, present queries that extract other forms of provenance, and give an executable definition of the graph semantics of dataflow expressions.
Notes: Umut Acar, Max-Planck Institute for Software Systems; Peter Buneman and James Cheney, University of Edinburgh; Jan Van den Bussche and Natalia Kwasnikowska, Hasselt University; Stijn Vansummeren, Université Libre de Bruxelles
Document URI:
Link to publication:
Category: C2
Type: Proceedings Paper
Appears in Collections:Research publications

Files in This Item:
File: buneman.pdf
Description: Conference publication
Size: 436.21 kB
Format: Adobe PDF
