Journal of Computer Applications ›› 2010, Vol. 30 ›› Issue (9): 2317-2320.

• Database and knowledge engineering •

Research of XML regular path expression query based on automaton

  

  • Received:2010-03-15 Revised:2010-05-16 Online:2010-09-03 Published:2010-09-01
  • Contact: Zhao Erping

Research on automaton-based XML regular path expression queries (基于自动机XML正则路径表达式查询研究)

Zhao Erping (赵尔平)¹, Wang Conghua (王聪华)¹, Luo Weiqun (雒伟群)¹, Dang Hong'en (党红恩)¹, Zhang Zhaoji (张兆基)²

  1. School of Information Engineering, Tibet Institute for Nationalities (西藏民族学院信息工程学院)
    2.
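  • Corresponding author: Zhao Erping (赵尔平)
  • Supported by:
    Key Project on Resources and Environment Technology of the 863 Program; Research Project of Tibet Institute for Nationalities (西藏民族学院)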

Abstract: Regular Path Expression (RPE) query techniques based on finite automata are a valuable approach to eXtensible Markup Language (XML) querying under the semi-structured data model. Many existing methods generate a large number of intermediate paths when they rewrite complex RPEs that contain the "//" operator and the "*" wildcard. The authors designed an efficient method for processing XML RPE queries, Cutting Schema Automaton Snippet (CSAS), which uses the Object Exchange Model (OEM) as the XML data model and a finite automaton as the query model. CSAS rewrites "//" and "*" with a snippet cut from the automaton into which the XML Schema is transformed, and it optimizes queries through pruning and a strategy that postpones predicate processing. The experimental results show that CSAS is an effective method for RPE queries.
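To make the idea of using a finite automaton as the query model concrete, the following is a minimal sketch and not the authors' CSAS implementation. It assumes a simplified tree-shaped stand-in for OEM data (nested `(label, children)` tuples) and a path syntax limited to "/", "//" and "*"; the names `parse_path` and `evaluate` are hypothetical. Each automaton state is the number of path steps matched so far, and a "//" step adds a self-loop on that state.

```python
from typing import List, Set, Tuple

Node = Tuple[str, list]  # (label, children): simplified, tree-shaped stand-in for OEM


def parse_path(expr: str) -> List[Tuple[str, str]]:
    """Split an absolute path such as '/a//c/*' into (axis, label) steps."""
    steps = []
    axis = "child"
    for part in expr.split("/")[1:]:
        if part == "":        # an empty piece marks the second slash of '//'
            axis = "desc"
            continue
        steps.append((axis, part))
        axis = "child"
    return steps


def evaluate(root: Node, expr: str) -> List[str]:
    """Return the label paths of all nodes matched by expr (assumed syntax)."""
    steps = parse_path(expr)
    final = len(steps)
    results: List[str] = []

    def visit(node: Node, path: str, active: Set[int]) -> None:
        label, children = node
        nxt: Set[int] = set()
        for state in active:
            if state == final:
                continue                  # accepting state has no outgoing moves
            axis, want = steps[state]
            if want == "*" or want == label:
                nxt.add(state + 1)        # this node satisfies the step
            if axis == "desc":
                nxt.add(state)            # '//' may skip over this node
        if not nxt:
            return                        # no live state: prune this subtree
        here = path + "/" + label
        if final in nxt:
            results.append(here)
        for child in children:
            visit(child, here, nxt)

    visit(root, "", {0})
    return results


doc = ("lib", [
    ("book", [("title", []), ("chapter", [("title", [])])]),
    ("journal", [("title", [])]),
])
print(evaluate(doc, "/lib//title"))
print(evaluate(doc, "/lib/*/title"))
```

In this sketch the subtree under `chapter` is abandoned for `/lib/*/title` as soon as no automaton state remains live, which is the simplest form of pruning; the pruning and predicate handling described in the abstract may work differently.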

Key words: eXtensible Markup Language (XML), Regular Path Expression (RPE), automaton, cutting, query processing

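Abstract (translated from the Chinese): Automaton-based regular path expression query techniques are a valuable approach in XML query research under the semi-structured data model. Many methods produce a large number of intermediate paths when rewriting complex regular paths that contain the "//" operator and the "*" wildcard. An efficient method for processing XML regular path queries, CSAS, is designed. It uses the Object Exchange Model (OEM) as the XML data model and a finite automaton as the query model, and it rewrites the "//" and "*" symbols with a rewriting automaton obtained by cutting a fragment out of the automaton converted from the XML Schema. Query optimization is achieved through pruning and a strategy of postponing predicate processing. Experiments show that CSAS is an efficient method for XML regular path expression queries.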
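One possible reading of "cutting a fragment out of the schema automaton" can be sketched as follows. This is an illustration under stated assumptions, not the published CSAS algorithm: the schema is modelled as a plain dictionary from an element name to the set of its allowed child elements, and the hypothetical function `cut_snippet` keeps only the elements that lie on some path from the source element to the target element of a "//" step. The resulting fragment describes every concrete path the "//" can stand for without enumerating those paths one by one.

```python
from typing import Dict, Set

Schema = Dict[str, Set[str]]  # element name -> allowed child element names (assumed model)


def reachable(graph: Schema, start: str) -> Set[str]:
    """All nodes reachable from start, including start itself."""
    seen = {start}
    stack = [start]
    while stack:
        node = stack.pop()
        for nxt in graph.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


def cut_snippet(schema: Schema, src: str, dst: str) -> Schema:
    """Keep only schema edges that lie on some path from src to dst."""
    # Build the reversed graph so that nodes able to reach dst can be found.
    reverse: Schema = {}
    for parent, kids in schema.items():
        for kid in kids:
            reverse.setdefault(kid, set()).add(parent)
    keep = reachable(schema, src) & reachable(reverse, dst)
    snippet: Schema = {}
    for parent in keep:
        kids = {k for k in schema.get(parent, ()) if k in keep}
        if kids:
            snippet[parent] = kids
    return snippet


schema = {
    "lib": {"book", "journal"},
    "book": {"title", "chapter", "author"},
    "chapter": {"title", "para"},
    "journal": {"title"},
    "author": {"name"},
}
print(cut_snippet(schema, "lib", "title"))
```

For the step `lib//title`, the elements `author`, `name` and `para` are cut away because no path through them leads to `title`, so an automaton built from the remaining fragment never explores them.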

Key words (translated from the Chinese): XML, regular path expression, automaton, cutting, query processing

CLC Number: