Journal of Computer Applications ›› 2018, Vol. 38 ›› Issue (5): 1315-1319. DOI: 10.11772/j.issn.1001-9081.2017102469

Two-level confidence threshold setting method for positive and negative association rules

CHEN Liu, FENG Shan   

  1. College of Mathematics and Software Science, Sichuan Normal University, Chengdu Sichuan 610068, China
  • Received:2017-10-18 Revised:2017-12-05 Online:2018-05-10 Published:2018-05-24
  • Contact: FENG Shan
  • Supported by:
    This work is partially supported by the National Natural Science Foundation of China (61673285), the Natural Science Foundation of Sichuan Education Department (15ZB0029), and the Sichuan Youth Science and Technology Foundation (2017JQ0046).

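  • About the authors: CHEN Liu (born 1991), female, from Linshui, Sichuan, master's student; main research interest: data mining. FENG Shan (born 1967), male, from Chongqing, professor, Ph.D.; main research interests: intelligent education platform software, data mining, and real-time database systems.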

Abstract: Traditional confidence threshold setting methods for positive and negative association rules struggle to limit the number of low-reliability rules and tend to miss some interesting rules. To address this problem, a new two-level confidence threshold setting method that incorporates the correlation of a rule's itemsets, called PNMC-TWO, was proposed. Firstly, taking into account the consistency, validity and interestingness of rules, and working within the correlation-support-confidence framework, the way in which the confidence of positive and negative association rules varies with the support of the rule's itemsets was analyzed systematically, starting from the computational relationship between rule confidence and itemset support. Then, in combination with users' demand in practical mining for rules that are both highly confident and interesting, a new confidence threshold setting model was proposed to avoid the blindness and arbitrariness of traditional methods when setting thresholds. Finally, the proposed method was compared experimentally with the original two-threshold method in terms of the quantity and quality of the rules. The experimental results show that the proposed method not only better ensures that the extracted association rules are valid and interesting, but also significantly reduces the number of low-reliability rules.

Key words: data mining, positive and negative association rules, rule confidence threshold, itemset correlation
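
The computational relationship between rule confidence and itemset support mentioned in the abstract can be illustrated with the standard textbook definitions for positive and negative association rules. The sketch below is only an illustration under assumptions: the function names (`support`, `rule_measures`, `candidate_rules`) and the string labels for rule forms are hypothetical, and the code does not reproduce the PNMC-TWO threshold model itself, whose details the abstract does not give. It computes the correlation s(A∪B) / (s(A)·s(B)) and the confidences of the four rule forms from itemset supports, then keeps only the forms that agree with the sign of the correlation.

```python
from fractions import Fraction


def support(transactions, itemset):
    """Fraction of transactions that contain every item of `itemset`."""
    itemset = set(itemset)
    hits = sum(1 for t in transactions if itemset <= set(t))
    return Fraction(hits, len(transactions))


def rule_measures(transactions, a, b):
    """Return (correlation, confidences) for antecedent `a` and consequent `b`.

    The confidences of the negative rule forms are derived from the
    supports of A, B and A∪B alone, using s(¬A) = 1 - s(A).
    """
    s_a = support(transactions, a)
    s_b = support(transactions, b)
    s_ab = support(transactions, set(a) | set(b))
    if not (0 < s_a < 1 and 0 < s_b < 1):
        raise ValueError("supports of A and B must lie strictly between 0 and 1")
    corr = s_ab / (s_a * s_b)
    conf = {
        "A=>B": s_ab / s_a,
        "A=>~B": (s_a - s_ab) / s_a,
        "~A=>B": (s_b - s_ab) / (1 - s_a),
        "~A=>~B": (1 - s_a - s_b + s_ab) / (1 - s_a),
    }
    return corr, conf


def candidate_rules(corr, conf):
    """Keep only the rule forms consistent with the sign of the correlation.

    corr > 1 (positive correlation): A=>B and ~A=>~B.
    corr < 1 (negative correlation): A=>~B and ~A=>B.
    corr == 1 (independence): no rule form is kept.
    """
    if corr > 1:
        forms = ("A=>B", "~A=>~B")
    elif corr < 1:
        forms = ("A=>~B", "~A=>B")
    else:
        forms = ()
    return {form: conf[form] for form in forms}
```

A confidence threshold, however it is chosen, would then be applied to the values returned by `candidate_rules`. Filtering by correlation first is what prevents contradictory pairs such as A⇒B and A⇒¬B from both being reported.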

CLC Number: