《计算机应用》唯一官方网站

• •    下一篇

支持关键词搜索的top-K条最优路线查询问题

赵浩宇1,于自强2,陈晓萌2,陈国祥1,朱慧1,李博涵3   

  1. 1. 烟台大学
    2. 山东省烟台市莱阳市烟台大学
    3. 南京航空航天大学
  • 收稿日期:2023-09-15 修回日期:2023-10-12 发布日期:2023-12-18 出版日期:2023-12-18
  • 通讯作者: 于自强
  • 基金资助:
    国家自然科学基金面上项目

Top-K optimal route query problem with keyword search support

  • Received:2023-09-15 Revised:2023-10-12 Online:2023-12-18 Published:2023-12-18
  • Contact: Ziqiang ZiqiangYu

摘要: 游客来到一座陌生的城市,希望游览多个景点,并品尝当地的美食后,最终前往旅馆休息。由于上述的需求,游客希望得到多条距离近、到达自己想去全部地点并且途径的地点受到消费者欢迎的路线。这类路线查询被称为支持关键词搜索的top-k条最优路线查询。该查询涉及给定一个道路网络、道路网络上的兴趣点集合,以及一个包含起点和多个关键词的路线查询。查询的目标是找到经过与查询关键词匹配的多个兴趣点的k条最优路线,这里的评分综合考虑了路线距离和路线经过的兴趣点的热度。然而,目前该问题的一些研究为了降低算法的复杂度,将用户输入的关键词作为到达兴趣点的顺序,不适用于对兴趣点到达顺序没有要求的场景,从而难以满足用户的真实意图,降低了实用性。另一些工作则为了提高查询效率,设定距离阈值对不符合要求的兴趣点进行剪枝,但该类算法不能保证被剪枝的兴趣点一定不能组成最优路线。针对上述算法的问题,本文提出了一种关键词感知的top-k最优路线查询算法。该算法首先将整个道路网络划分为多个子网络,并采用启发式搜索策略从查询起点所在子网络开始逐步扩展搜索范围,直至找到top-k条最优路线或遍历完整个道路网络。在扩展过程中,该算法引入了子图剪枝策略,用于剪去那些不包含top-k最优路线的子网络,从而缩小搜索范围。此外,为了避免对每个可能构成最优路线的兴趣点集合依次进行计算,该算法还设计了兴趣点序列的剪枝策略,以快速过滤掉那些不可能构成最优路线的兴趣点序列,从而降低计算代价。同时,在真实数据集和合成数据集上进行了充分实验,验证了所提出的算法优于基准方法的有效性。

关键词: 道路网络, 条最优路线, 兴趣点, 关键词搜索, 个性化旅游路线

Abstract: Tourists visit an unfamiliar city with the intention of exploring multiple attractions and sampling local cuisine before eventually heading to their hotel for rest. Due to these requirements, tourists seek multiple routes that are short in distance, cover all desired locations, and are popular among consumers. Such route queries that support keyword searches for the top-k optimal routes are known as top-k optimal route queries with keyword support. This query involves providing a road network, a set of points of interest on the network, and a route query containing a starting point and multiple keywords. The objective of the query is to find k optimal routes that pass through multiple points of interest matching the query's keywords, where the score considers both the route distance and the popularity of the points of interest along the route. However, some existing research on this problem reduces the complexity of the algorithm by assuming that the user-input keywords determine the order of visiting the points of interest. This approach is not suitable for scenarios where there are no specific requirements on the order of visiting the points, thus failing to meet the user's true intent and reducing practicality. Other approaches aim to improve query efficiency by setting distance thresholds to prune points of interest that do not meet the requirements. However, such algorithms cannot guarantee that the pruned points of interest will not be part of the optimal routes. In response to the issues with the above-mentioned algorithm, this paper proposes a keyword-aware top-k optimal route query algorithm. The algorithm first divides the entire road network into multiple subnetworks and employs a heuristic search strategy, starting from the subgraph containing the query's starting point and gradually expanding the search scope until it finds the top-k optimal routes or explores the entire road network. During the expansion process, the algorithm introduces a subgraph pruning strategy to eliminate subgraphs that do not contain the top-k optimal routes, thereby reducing the search scope. Additionally, to avoid computing every possible set of points of interest that could form the optimal routes, the algorithm designs a points of interest sequence pruning strategy to quickly filter out sequences that are unlikely to form the optimal routes, thus reducing computation costs. Through the proposed algorithm, tourists can more easily plan their travel routes and obtain more optimal choices that suit their preferences. Furthermore, extensive experiments on real and synthetic datasets have been conducted to demonstrate the effectiveness of the proposed algorithm compared to baseline methods.

Key words: road network, top-k optimal routes, POI, keywords search, personalized travel route