Journal of Computer Applications, 2009, Vol. 29, Issue (10): 2778-2780.

• Data Mining •

Matching algorithm for twig patterns with parent-child edges based on ordered pair

Wang Rui (王瑞), Tao Shiqun (陶世群)

  1. School of Computer and Information Technology, Shanxi University
  • Received: 2009-04-07 Revised: 2009-05-23 Online: 2009-10-28 Published: 2009-10-01
  • Corresponding author: Wang Rui

Abstract: With the growth of the Internet and the steadily increasing volume of XML data on the Web, querying XML data accurately and efficiently has become an active research topic. Many twig pattern matching algorithms have been proposed, but they do not adequately handle twig patterns that contain parent-child edges. To address this problem, a new algorithm based on ordered pairs, PCTwig, is proposed. It answers a query by building ordered pairs that represent parent-child relationships on both the query tree and the document tree. The query process avoids generating intermediate results and requires no merge operation. Experiments show that the algorithm is effective.

Key words: XML document, XPath, twig pattern matching, ordered pair, parent-child relationship
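
The abstract does not give the details of PCTwig, so the following is only a minimal sketch of the general idea it describes: recording each parent-child edge of the document tree as an ordered pair of labels, then checking the parent-child edges of a query twig against those pairs. The tree encoding `(label, [children])`, the preorder node ids, and the function names `index_document`, `matches_at`, and `twig_roots` are all assumptions made for illustration. The sketch does not reproduce the paper's algorithm or its claims about intermediate results and merging.

```python
from collections import defaultdict


def index_document(tree):
    """Index a document tree given as (label, [subtrees]).

    Returns:
      labels: {node_id: label}, with ids assigned in preorder
      pairs:  {(parent_label, child_label): {parent_id: [child_id, ...]}}
    Both structures are illustrative assumptions, not taken from the paper.
    """
    labels = {}
    pairs = defaultdict(lambda: defaultdict(list))

    def visit(node, parent_id):
        label, children = node
        node_id = len(labels)  # next free preorder id
        labels[node_id] = label
        if parent_id is not None:
            # Record the parent-child edge under its ordered pair of labels.
            pairs[(labels[parent_id], label)][parent_id].append(node_id)
        for child in children:
            visit(child, node_id)

    visit(tree, None)
    return labels, pairs


def matches_at(query, node_id, labels, pairs):
    """True if the query twig (parent-child edges only) matches at node_id."""
    q_label, q_children = query
    if labels[node_id] != q_label:
        return False
    for q_child in q_children:
        # Only direct children with the right label pair are candidates.
        candidates = pairs.get((q_label, q_child[0]), {}).get(node_id, [])
        if not any(matches_at(q_child, c, labels, pairs) for c in candidates):
            return False
    return True


def twig_roots(query, labels, pairs):
    """Ids of all document nodes at which the query twig matches."""
    return [n for n in labels if matches_at(query, n, labels, pairs)]


# Hypothetical document: book(title, author(name), section(title, author))
doc = ("book", [
    ("title", []),
    ("author", [("name", [])]),
    ("section", [("title", []), ("author", [])]),
])
labels, pairs = index_document(doc)

# Query author/name: only the first author element has a name child.
print(twig_roots(("author", [("name", [])]), labels, pairs))
# Query book/name: name is a descendant of book but not a child, so no match.
print(twig_roots(("book", [("name", [])]), labels, pairs))
```

Because every lookup goes through a `(parent_label, child_label)` key, an ancestor-descendant relationship never satisfies a query edge, which is the distinction that makes parent-child edges harder than descendant edges in twig matching.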
