Please use this identifier to cite or link to this item: http://hdl.handle.net/1942/700
Title: Information extraction from Web documents based on local unranked tree automaton inference
Authors: Kosala, Raymond; Bruynooghe, Maurice; Van den Bussche, Jan; Blockeel, Hendrik
Issue Date: 2003
Publisher: Morgan Kaufmann
Source: Gottlob, G. & Walsh, T. (Eds.) Proceedings of the 18th International Joint Conference on Artificial Intelligence. p. 403-408.
Abstract: Information extraction (IE) aims at extracting specific information from a collection of documents. A lot of previous work on IE from semi-structured documents (in XML or HTML) uses learning techniques based on strings. Some recent work converts the document to a ranked tree and uses tree automaton induction. This paper introduces an algorithm that uses unranked trees to induce an automaton. Experiments show that this gives the best results obtained so far for IE from semi-structured documents based on learning.
Document URI: http://hdl.handle.net/1942/700
Category: C2
Type: Proceedings Paper
Appears in Collections: Research publications

Files in This Item:
File: datamining1.pdf
Size: 197.45 kB
Format: Adobe PDF
