Journal of Computer Applications (计算机应用)

• Database Technology (Special Topic)

ProGen: Provenance database generator for large-scale data sets
(ProGen:海量数据的出处数据库生成器)

Xiao ZHANG (张孝), Shan WANG (王珊), Na LIAN (廉娜)

  1. Renmin University of China (中国人民大学)
  • Received: 2008-08-03 Revised: 1900-01-01 Online: 2008-11-01 Published: 2008-11-01
  • Contact: Xiao ZHANG

Abstract: Provenance is crucial for researchers, and for scientists in particular, in judging the correctness and timeliness of data and experiments. With the wide adoption of view materialization in databases and of data annotation and revision techniques, provenance is becoming a new research focus. An appropriate provenance data set is one of the foundations for testing the functional correctness and performance of new provenance-management techniques and algorithms. Before real provenance data can be obtained, synthetic provenance data that is as realistic as possible is equally important for validating and improving those algorithms. This paper presents ProGen, a new provenance database generator that produces a provenance database of a required size from the relational schema used by the provenance data and the annotation constraints on that provenance. Experiments show that the implementation is efficient and scalable.

Key words: database testing, provenance, automatic generation
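
The abstract states only ProGen's inputs (a relational schema, annotation constraints, and a target size) and its output (a provenance database); it does not describe the algorithm. The following is therefore a minimal illustrative sketch of that input/output contract, not ProGen's actual method. Every name in it (`Column`, `generate_provenance_db`, `sources`, `max_annotations`) is hypothetical, and the single constraint it enforces (each row carries between 1 and `max_annotations` annotated cells) is an assumption chosen for illustration.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Column:
    """Hypothetical schema column: a name and a simple type tag."""
    name: str
    kind: str  # "int" or "str"


def generate_provenance_db(schema, sources, n_rows, max_annotations=2, seed=0):
    """Generate synthetic rows plus provenance annotations.

    Returns (rows, annotations):
      rows        -- list of dicts, one per generated tuple
      annotations -- list of (row_id, column_name, source) triples that
                     record which source each annotated cell is
                     attributed to
    This is an illustrative stand-in, not the algorithm from the paper.
    """
    if not schema or not sources:
        raise ValueError("schema and sources must be non-empty")
    rng = random.Random(seed)  # seeded so the output is reproducible
    rows, annotations = [], []
    for row_id in range(n_rows):
        row = {"id": row_id}
        for col in schema:
            if col.kind == "int":
                row[col.name] = rng.randint(0, 10_000)
            else:
                row[col.name] = f"{col.name}_{rng.randint(0, 10_000)}"
        rows.append(row)
        # Assumed annotation constraint: 1..max_annotations annotated
        # cells per row, each on a distinct column.
        k = rng.randint(1, min(max_annotations, len(schema)))
        for col in rng.sample(schema, k):
            annotations.append((row_id, col.name, rng.choice(sources)))
    return rows, annotations


if __name__ == "__main__":
    schema = [Column("gene", "str"), Column("score", "int")]
    rows, annotations = generate_provenance_db(
        schema, ["src_A", "src_B"], n_rows=1000
    )
    print(len(rows), len(annotations))
```

Scaling the output in this sketch is just a matter of raising `n_rows`; a real generator would also need to honor keys, value distributions, and richer annotation constraints, none of which are modeled here.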