计算机应用 ›› 2021, Vol. 41 ›› Issue (6): 1686-1693.DOI: 10.11772/j.issn.1001-9081.2020091453

所属专题: 数据科学与技术

• 数据科学与技术 • 上一篇    下一篇

距离-关键字相似度约束的双色反k近邻查询方法

张豪, 朱睿, 宋栿尧, 方鹏, 夏秀峰   

  1. 沈阳航空航天大学 计算机学院, 沈阳110136
  • 收稿日期:2020-09-18 修回日期:2020-12-09 出版日期:2021-06-10 发布日期:2020-12-18
  • 通讯作者: 张豪
  • 作者简介:张豪(1996-),男,河北石家庄人,硕士研究生,主要研究方向:大数据;朱睿(1982-),男,辽宁沈阳人,副教授,博士,CCF会员,主要研究方向:大数据;宋栿尧(1995-),男,辽宁鞍山人,硕士研究生,主要研究方向:大数据;方鹏(1994-),男,甘肃平凉人,硕士研究生,主要研究方向:新媒体大数据;夏秀峰(1964-),男,山东胶南人,教授,博士,主要研究方向:数据库、数据仓库、数据挖掘。
  • 基金资助:
    国家自然科学基金资助项目(61702344)。

Bichromatic reverse k nearest neighbor query method based on distance-keyword similarity constraint

ZHANG Hao, ZHU Rui, SONG Fuyao, FANG Peng, XIA Xiufeng   

  1. College of Computer Science, Shenyang Aerospace University, Shenyang Liaoning 110136, China
  • Received:2020-09-18 Revised:2020-12-09 Online:2021-06-10 Published:2020-12-18
  • Supported by:
    This work is partially supported by the National Natural Science Foundation of China (61702344).

摘要: 针对空间关键字双色反k近邻查询返回结果质量较低的问题,提出了基于距离-关键字相似度约束的双色反k近邻查询方法。首先,通过设置一个阈值将查询结果中质量较低的用户给过滤掉,从而避免了查询结果中出现空间距离相对较远的用户,保证了查询结果质量;然后,为支持该查询,提出了一种关键字多分辨率网格矩形树(KMG-Tree)索引来管理数据;最后,提出了基于Six-region算法的Six-region-optimize算法来提高查询处理效率。Six-region-optimize算法的查询效率相较baseline和Six-region算法分别平均提高了约85.71%和23.45%。基于真实时空数据进行实验测试和分析,实验结果验证了Six-region-optimize算法的有效性和高效性。

关键词: 关键字, 双色反k近邻查询, 空间距离, 相似度约束, 查询效率

Abstract: In order to solve the problem of low quality of results returned by spatial keyword bichromatic reverse k nearest neighbor query, a bichromatic reverse k nearest neighbor query method based on distance-keyword similarity constraint was proposed. Firstly, a threshold was set to filter out the low-quality users in the query results, so that the existence of users with relatively long spatial distance in the query results was avoided and the quality of the query results was ensured. Then, in order to support this query, an index of Keyword Multiresolution Grid rectangle-tree (KMG-tree) was proposed to manage the data. Finally, the Six-region-optimize algorithm based on Six-region algorithm was proposed to improve the query processing efficiency. The query efficiency of the Six-region-optimize algorithm was about 85.71% and 23.45% on average higher than those of the baseline and Six-region algorithms respectively. Experimental test and analysis were carried out based on real spatio-temporal data. The experimental results verify the effectiveness and high efficiency of the Six-region-optimize algorithm.

Key words: keyword, bichromatic reverse k nearest neighbor query, spatial distance, similarity constraint, query efficiency

中图分类号: