Journal of Computer Applications ›› 2015, Vol. 35 ›› Issue (1): 125-130. DOI: 10.11772/j.issn.1001-9081.2015.01.0125

• Data technology •

Online pedigree editing system based on graph database

JIANG Yang, PENG Zhiyong, PENG Yuwei

  1. Computer School, Wuhan University, Wuhan Hubei 430072, China
  • Received: 2014-07-18 Revised: 2014-09-19 Online: 2015-01-01 Published: 2015-01-26
  • Corresponding author: PENG Yuwei
  • About the authors: JIANG Yang (born 1988), male, from Xiangyang, Hubei, master's student; research interests: database kernels, Web data management. PENG Zhiyong (born 1963), male, from Wuhan, Hubei, professor, Ph.D.; research interests: Web data management, complex data management, trusted data management. PENG Yuwei (born 1980), male, from Wuhan, Hubei, lecturer, Ph.D.; research interests: databases, geographic information systems.

Abstract:

Existing domestic pedigree systems suffer from limited data sharing, poor scalability and low editing efficiency. To address these problems, an online pedigree editing system based on Browser/Server (B/S) architecture and a graph database was proposed and implemented. First, the system adopts the B/S architecture to support online collaborative entry by multiple users, which improves data-entry efficiency. Second, it stores pedigrees in a database, which eases centralized management, statistics and retrieval and increases data sharing. Third, because pedigree data are graph-structured by nature, the system manages them with a graph database, which greatly improves data-processing efficiency. Finally, the effectiveness of the system was verified through efficiency comparisons on real pedigree data. In the experiments, the Liu family pedigree, covering about 200,000 people, was used to compare the storage and query efficiency of the relational database PostgreSQL and the graph database Neo4j. The results show that Neo4j saves about 50% of the storage space compared with PostgreSQL, and that for four common queries (descendant query, ancestor query, kinship query and descendant gender statistics) the average response time with Neo4j is about 20%, 80%, 16% and 15%, respectively, of that with PostgreSQL. It can be concluded that the online pedigree editing system based on a graph database can process large amounts of pedigree data efficiently and supports online collaborative editing by multiple users.

Key words: pedigree, digitalization, Browser/Server (B/S) architecture, graph database, query performance
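
The four queries compared in the abstract are all graph traversals, which may help explain why a graph store suits pedigree data. The following is a minimal in-memory sketch in Python, not the authors' implementation and not Neo4j or PostgreSQL code. The class `Pedigree`, its method names and the single-letter person names are hypothetical, and the sketch assumes an acyclic parent-to-child graph.

```python
from collections import deque


class Pedigree:
    """Toy parent -> child graph; a hypothetical stand-in for the real store."""

    def __init__(self):
        self.gender = {}    # person -> "M" or "F"
        self.children = {}  # person -> list of children
        self.parents = {}   # person -> list of parents

    def add_person(self, name, gender):
        self.gender[name] = gender
        self.children.setdefault(name, [])
        self.parents.setdefault(name, [])

    def add_child(self, parent, child):
        self.children[parent].append(child)
        self.parents[child].append(parent)

    def _reach(self, start, edges):
        """Breadth-first search; returns {person: distance}, excluding start."""
        dist = {}
        queue = deque([(start, 0)])
        while queue:
            node, d = queue.popleft()
            for nxt in edges[node]:
                if nxt not in dist:
                    dist[nxt] = d + 1
                    queue.append((nxt, d + 1))
        return dist

    def descendants(self, person):
        """Descendant query: everyone reachable along child edges."""
        return set(self._reach(person, self.children))

    def ancestors(self, person):
        """Ancestor query: everyone reachable along parent edges."""
        return set(self._reach(person, self.parents))

    def kinship(self, a, b):
        """Kinship query: nearest common ancestor and the generation gap
        from each person to it, or None if the two are unrelated."""
        up_a = {a: 0, **self._reach(a, self.parents)}
        up_b = {b: 0, **self._reach(b, self.parents)}
        common = set(up_a) & set(up_b)
        if not common:
            return None
        best = min(common, key=lambda p: (up_a[p] + up_b[p], p))
        return best, up_a[best], up_b[best]

    def descendant_gender_counts(self, person):
        """Descendant gender statistics: count descendants by gender."""
        counts = {"M": 0, "F": 0}
        for d in self.descendants(person):
            counts[self.gender[d]] += 1
        return counts


# Made-up five-person family: A has children B and C; B has D; C has E.
p = Pedigree()
for name, g in [("A", "M"), ("B", "M"), ("C", "F"), ("D", "M"), ("E", "F")]:
    p.add_person(name, g)
for parent, child in [("A", "B"), ("A", "C"), ("B", "D"), ("C", "E")]:
    p.add_child(parent, child)

print(sorted(p.descendants("A")))       # ['B', 'C', 'D', 'E']
print(sorted(p.ancestors("D")))         # ['A', 'B']
print(p.kinship("D", "E"))              # ('A', 2, 2)
print(p.descendant_gender_counts("A"))  # {'M': 2, 'F': 2}
```

Each query follows parent or child edges directly from the starting person. In a relational layout the same traversal is typically expressed with repeated self-joins or a recursive query, which is one plausible reason for the response-time gap reported above.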

CLC number: