Journal of Computer Applications ›› 2018, Vol. 38 ›› Issue (1): 152-158. DOI: 10.11772/j.issn.1001-9081.2017051219

• Data Science and Technology •

Automatic categorization of spatial data query results based on coupling relevance

毕崇春1, 孟祥福1, 张霄雁1, 唐延欢1, 唐晓亮2, 梁海波1   

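  1. College of Electronic and Information Engineering, Liaoning Technical University, Huludao, Liaoning 125105;
    2. College of Software, Liaoning Technical University, Huludao, Liaoning 125105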
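  • Received: 2017-05-19 Revised: 2017-07-17 Online: 2018-01-10 Published: 2018-01-22
  • Corresponding author: BI Chongchun
  • About the authors: BI Chongchun (born 1992), male, from Dandong, Liaoning, M.S. candidate, CCF member; main research interests: spatial data analysis and query. MENG Xiangfu (born 1981), male, from Chaoyang, Liaoning, associate professor, doctoral supervisor, Ph.D., CCF member; main research interests: Web database query, spatial data analysis. ZHANG Xiaoyan (born 1983), female, from Yantai, Shandong, engineer, Ph.D. candidate; main research interests: spatio-temporal database query, urban computing. TANG Yanhuan (born 1992), male, from Shantou, Guangdong, M.S. candidate; main research interests: spatial data mining, recommender systems. TANG Xiaoliang (born 1980), male, from Fuxin, Liaoning, lecturer, Ph.D.; main research interest: machine learning. LIANG Haibo (born 1995), male, from Beihai, Guangxi; main research interests: data mining, database query.
  • Funding:
    General Program of the National Natural Science Foundation of China (61772249); General Project of the Education Department of Liaoning Province (LJYL018); Natural Science Foundation of Liaoning Province (20170540418).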

Coupling similarity-based approach for categorizing spatial database query results

BI Chongchun1, MENG Xiangfu1, ZHANG Xiaoyan1, TANG Yanhuan1, TANG Xiaoliang2, LIANG Haibo1   

  1. College of Electronic and Information Engineering, Liaoning Technical University, Huludao, Liaoning 125105, China;
    2. College of Software, Liaoning Technical University, Huludao, Liaoning 125105, China
  • Received: 2017-05-19 Revised: 2017-07-17 Online: 2018-01-10 Published: 2018-01-22
  • Supported by:
    This work is partially supported by the National Natural Science Foundation of China (61401185), the General Project of Liaoning Province Education Department (LJYL018), and the Natural Science Foundation of Liaoning Province (20170540418).

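Abstract: Because spatial databases usually hold massive amounts of data, an ordinary spatial query is likely to return too many results. To address this problem, an automatic categorization method for spatial query results is proposed. In the offline phase, the coupling relationship between spatial objects is evaluated from their location proximity and semantic relevance, and on this basis the spatial objects are clustered with a probability density estimation method, with each cluster representing one type of user need. In the online query processing phase, for a given spatial query, a modified C4.5 decision tree algorithm dynamically builds a categorization tree over the query result set, and users can progressively locate the spatial objects of interest by inspecting the labels on the tree's branches. Experimental results show that the proposed spatial object clustering method effectively reflects the semantic and locational closeness of spatial objects, and that the query result categorization method achieves good categorization quality at low search cost.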

Key words: spatial database, clustering, coupling relationship, query result categorization

Abstract: Because a spatial database usually contains a large amount of data, a common spatial query often returns too many results. To deal with this problem, a new categorization approach for spatial database query results was proposed. The solution consists of two steps. In the offline step, the coupling relationship between spatial objects was evaluated by considering the location proximity and semantic similarity between them, and a set of clusters over the spatial objects was then generated by a probability density-based clustering method, where each cluster represented one type of user requirement. In the online query step, for a given spatial query, a category tree was dynamically generated for the user by applying a modified C4.5 decision tree algorithm over the clusters, so that the user could easily select the subset of query results matching his/her needs by exploring the labels assigned to the intermediate nodes of the tree. The experimental results demonstrate that the proposed spatial object clustering method effectively captures both the semantic and location relationships between spatial objects, and that the query result categorization algorithm is effective and has a low search cost.

Key words: spatial database, clustering, coupling relationship, query result categorization
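
The abstract states that the coupling relationship between two spatial objects combines location proximity and semantic similarity, but it does not give the formulas. The following is only a minimal illustrative sketch under assumed choices, not the authors' definition: location proximity is taken as an exponential decay of Euclidean distance, semantic similarity as the Jaccard overlap of keyword sets, and the two are mixed with a weight. The names location_proximity, semantic_similarity, coupling_score, alpha and scale are all hypothetical.

```python
import math

def location_proximity(p, q, scale=1.0):
    """Map Euclidean distance to (0, 1]; 1.0 means identical coordinates.
    The exponential decay and `scale` are assumptions for illustration."""
    return math.exp(-math.dist(p, q) / scale)

def semantic_similarity(a, b):
    """Jaccard overlap of two keyword sets (an assumed stand-in measure)."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)

def coupling_score(obj1, obj2, alpha=0.5, scale=1.0):
    """Weighted mix of location proximity and semantic similarity.
    `alpha` weights location; (1 - alpha) weights semantics (assumed form)."""
    loc = location_proximity(obj1["xy"], obj2["xy"], scale)
    sem = semantic_similarity(obj1["keywords"], obj2["keywords"])
    return alpha * loc + (1 - alpha) * sem

# Two hypothetical spatial objects with coordinates and descriptive keywords.
a = {"xy": (0.0, 0.0), "keywords": {"cafe", "wifi"}}
b = {"xy": (3.0, 4.0), "keywords": {"cafe", "bar"}}
score = coupling_score(a, b, alpha=0.5, scale=5.0)
```

A pairwise score of this kind could feed a density-based clustering step; the weight alpha controls whether nearby objects or semantically related objects are grouped more readily.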

CLC number: