Please use this identifier to cite or link to this item: http://hdl.handle.net/1942/13105
Title: The complexity of text-preserving XML transformations
Authors: ANTONOPOULOS, Timos; MARTENS, Wim; NEVEN, Frank
Issue Date: 2011
Publisher: ACM Press
Source: Lenzerini, Maurizio; Schwentick, Thomas (Eds.). Proceedings of the 30th Symposium on Principles of Database Systems, ACM Press, p. 247-258
Abstract: While XML is nowadays adopted as the de facto standard for data exchange, historically, its predecessor SGML was invented for describing electronic documents, i.e., marked up text. Actually, today there are still large volumes of such XML texts. We consider simple transformations which can change the internal structure of documents, that is, the mark-up, and can filter out parts of the text but do not disrupt the ordering of the words. Specifically, we focus on XML transformations where the transformed document is a subsequence of the input document when ignoring mark-up. We call the latter text-preserving XML transformations. We characterize such transformations as copy- and rearrange-free transductions. Furthermore, we study the problem of deciding whether a given XML transducer is text-preserving over a given tree language. We consider top-down transducers as well as the abstraction of XSLT called DTL. We show that deciding whether a transformation is text-preserving over an unranked regular tree language is in PTime for top-down transducers, EXPTime-complete for DTL with XPath, and decidable for DTL with MSO patterns. Finally, we obtain that for every transducer in one of the above mentioned classes, the maximal subset of the input schema can be computed on which the transformation is text-preserving.
Keywords: Algorithms; theory; verification
Document URI: http://hdl.handle.net/1942/13105
ISBN: 978-1-4503-0660-7
DOI: 10.1145/1989284.1989316
Category: C1
Type: Proceedings Paper
Appears in Collections: Research publications

Files in This Item:
File: p247-antonopoulos.pdf (Restricted Access)
Description: Published version
Size: 719.61 kB
Format: Adobe PDF

Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.