Journal of Computer Applications ›› 2015, Vol. 35 ›› Issue (2): 440-443. DOI: 10.11772/j.issn.1001-9081.2015.02.0440

Query expansion method based on semantic property feature graph

HAN Caili, LI Jiajun, ZHANG Xiaopei, XIAO Min   

  1. School of Computer Science, South China Normal University, Guangzhou Guangdong 510631, China
  • Received: 2014-09-10 Revised: 2014-11-18 Online: 2015-02-10 Published: 2015-02-12

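  • Corresponding author: HAN Caili
  • About the authors: HAN Caili (born 1989), female, from Puyang, Henan, is a master's student whose research interests include the semantic Web, query expansion, and ontology; LI Jiajun (born 1990), male, from Baiquan, Heilongjiang, is a master's student whose research interest is big data computing; ZHANG Xiaopei (born 1988), female, from Pingdingshan, Henan, is a master's student whose research interests include ontology and semantic similarity; XIAO Min (born 1989), female, from Loudi, Hunan, is a master's student whose research interest is semantic similarity.
  • Supported by:

    Natural Science Foundation of Guangdong Province (10151063101000031); Guangzhou Science Program Project (2014J4100031).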

Abstract:

Traditional query expansion methods ignore the semantic relations between words, so they often fail to add suitable keywords to short, nonstandard queries. Linked Data technology uses the graph model of the Resource Description Framework (RDF) to form the Linked Open Data Cloud, which provides richer semantic information. To take semantic relationships into account, a new query expansion method based on a semantic property feature graph was proposed, combining ideas from the semantic Web and graphs. It used DBpedia resources as nodes to build an RDF property graph in which the relevance of a node was determined by its relations. First, the weights of 15 kinds of semantic properties, which express the semantic similarity between resources, were obtained by supervised learning. Then, the query keywords were mapped to DBpedia resources through the label properties over the whole DBpedia graph. Next, neighbor nodes were found by breadth-first search along these semantic properties and used as candidate expansion words. Finally, the candidates with the highest relevance scores were selected as the query expansion terms. The experimental results show that, compared with LOD Keyword Expansion, the proposed method achieves a recall of 0.89 and improves Mean Reciprocal Rank (MRR) by 4 percentage points, giving results that better match user queries.

Key words: query expansion, linked data, semantic Web, semantic property feature graph, Resource Description Framework (RDF)
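
The steps outlined in the abstract (label lookup, breadth-first search over weighted properties, selection of the top-scoring candidates) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the toy graph, the label table, the three property weights, and the scoring rule (the product of property weights along a path). The paper learns 15 weights by supervised learning, and its actual relevance score is not given in the abstract.

```python
from collections import deque

# Hypothetical weights for three properties. The paper learns 15 such
# weights by supervised learning; these values are invented.
PROPERTY_WEIGHTS = {
    "owl:sameAs": 0.9,
    "dct:subject": 0.6,
    "rdf:type": 0.3,
}

# Toy stand-in for the DBpedia graph: resource -> [(property, neighbor)].
GRAPH = {
    "dbr:Jaguar": [
        ("owl:sameAs", "dbr:Panthera_onca"),
        ("dct:subject", "dbc:Felines"),
        ("rdf:type", "dbo:Mammal"),
    ],
    "dbc:Felines": [("dct:subject", "dbr:Leopard")],
}

# Toy stand-in for the label lookup: lowercased label -> resource.
LABELS = {"jaguar": "dbr:Jaguar"}


def expand_query(keyword, max_depth=2, top_k=3):
    """Return up to top_k expansion candidates for a single keyword."""
    start = LABELS.get(keyword.lower())
    if start is None:
        return []  # keyword could not be mapped to a resource

    scores = {}          # best score seen so far for each candidate
    visited = {start}    # resources already placed on the queue
    queue = deque([(start, 1.0, 0)])

    while queue:
        node, score, depth = queue.popleft()
        if depth >= max_depth:
            continue  # do not expand beyond the depth limit
        for prop, neighbor in GRAPH.get(node, []):
            weight = PROPERTY_WEIGHTS.get(prop, 0.0)
            if weight <= 0.0 or neighbor == start:
                continue  # skip unweighted properties and the seed itself
            # Assumed scoring rule: multiply weights along the path.
            candidate = score * weight
            scores[neighbor] = max(scores.get(neighbor, 0.0), candidate)
            if neighbor not in visited:
                visited.add(neighbor)
                queue.append((neighbor, candidate, depth + 1))

    # Highest score first; ties broken by resource name for determinism.
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [resource for resource, _ in ranked[:top_k]]


print(expand_query("Jaguar"))
```

With this toy data the call ranks `dbr:Panthera_onca` first, then `dbc:Felines`, then `dbr:Leopard`, because the two-hop path to `dbr:Leopard` still outscores the low-weight `rdf:type` edge to `dbo:Mammal`. A real implementation would query DBpedia (for example through its SPARQL endpoint) instead of in-memory dictionaries.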

CLC Number: