Internet and Web Applications and Services, International Conference on

Abstract

In this paper, we present an extension of Phil, a declarative language for filtering information from XML data. The proposed approach extracts relevant data from an XML document and excludes useless or misleading content. It combines ontology reasoning with an approximate pattern-matching engine that searches for patterns flexibly (i.e., modulo renaming, insertion, and deletion of XML items) and ranks the results by cost. The filtering process is guided by both the syntax and the semantics of the XML documents, since it relies on the document structure as well as the ontological information to which the document is related. This information is retrieved by querying (possibly remote) ontology reasoners. Our methodology has been implemented in the XPhil system, which is written in Haskell. Using the XML benchmarking tool xmlgen, we have developed scalable experiments that demonstrate the usefulness of our approach.
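
To make the idea of matching "modulo renaming, insertion, and deletion" with cost-ranked results more concrete, the following is a minimal sketch in Python. It is not Phil's or XPhil's actual algorithm or syntax: it only compares a flat list of expected child tags against each element's children using a standard edit-distance recurrence, and it leaves out ontology reasoning entirely. The unit costs, the function names match_cost and rank_matches, and the sample document are all assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical unit costs; the abstract does not specify the real cost model.
RENAME_COST, INSERT_COST, DELETE_COST = 1, 1, 1


def match_cost(pattern, labels):
    """Minimum cost of turning `pattern` into `labels` by renaming,
    inserting, or deleting items (standard edit-distance recurrence)."""
    rows, cols = len(pattern) + 1, len(labels) + 1
    cost = [[0] * cols for _ in range(rows)]
    for i in range(1, rows):
        cost[i][0] = i * DELETE_COST  # drop every remaining pattern item
    for j in range(1, cols):
        cost[0][j] = j * INSERT_COST  # insert every remaining document item
    for i in range(1, rows):
        for j in range(1, cols):
            same = pattern[i - 1] == labels[j - 1]
            cost[i][j] = min(
                cost[i - 1][j - 1] + (0 if same else RENAME_COST),
                cost[i - 1][j] + DELETE_COST,
                cost[i][j - 1] + INSERT_COST,
            )
    return cost[-1][-1]


def rank_matches(xml_text, pattern):
    """Score each child of the root against `pattern`, cheapest first."""
    root = ET.fromstring(xml_text)
    scored = [(match_cost(pattern, [c.tag for c in el]), el.tag) for el in root]
    return sorted(scored)


# Illustrative document and pattern (not taken from the paper).
DOC = """
<catalog>
  <book><title/><author/><price/></book>
  <magazine><title/><editor/><price/></magazine>
  <flyer><title/></flyer>
</catalog>
"""
PATTERN = ["title", "author", "price"]

if __name__ == "__main__":
    for cost, tag in rank_matches(DOC, PATTERN):
        print(cost, tag)
```

Under these assumed costs, the book element matches exactly (cost 0), magazine needs one renaming (cost 1), and flyer needs two deletions (cost 2), so the results come back in that order. A system that also consulted an ontology could, for example, charge less for renaming author to editor than for renaming it to an unrelated tag; that refinement is not modeled here.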
